FIELD OF THE INVENTION

The subject invention relates generally to data processing and more
particularly to a method and apparatus providing high speed parallel accessing of data
stored at a number of remote heterogeneous sites and automatic, pipelined execution
of successive methods on such data.

BACKGROUND OF THE INVENTION AND RELATED ART 

Present technology is witnessing the development of large remote
databases or "data warehouses", as well as rapid expansion of the Internet and
proliferation of corporate intranets. Demand is growing for increasingly large and
rapid data transfers involving streaming video, visualization graphics and large data
warehouse downloads over such new network protocols as Fast Ethernet and
Gigabit Ethernet. The data which it would be desirable to access may be stored
across heterogeneous sites, i.e., sites which contain different types of database
systems or other data containers. Hence the data which may need to be accessed may
be referred to as "heterogeneous data."

At the same time as demand has grown for large and rapid data
transfers, there has been constant pressure to simplify the user interface to a vast array
of components and data storage facilities. While individual components in a particular
solution are often easy to use, combining them in a complete solution still presents
extremely complex problems to the user.

In addition to simplifying the user interface to heterogeneous data and
complex arrays of components, it appears desirable to provide the user with added
capabilities to readily command and perform more powerful automated data
processing operations, in addition to simple "search" queries.



SUMMARY OF THE INVENTION 

Our co-pending application, U.S. Serial No.        , entitled Method and
Apparatus For High Speed Parallel Accessing And Execution of Methods Across
Multiple Heterogeneous Data Sources, provides the capability to access distributed
data contained in a number of distributed heterogeneous data sources via a search
initiated by a single JAVA script wherein a single object represents the data to be
retrieved and subjected to a method in the script. According to the subject invention,
successive methods set out in a script are executed automatically across distributed
data, with the results of one execution automatically piped to the next method. An
example is the search of a data object followed by an automatic sort of the results of
the search, followed by the e-mail of the results of the sort, wherein the data to be
searched is distributed across a plurality of nodes, each node having a different type of
database.

The invention finds one application in a system employing metadata-based
high level abstraction of a federation of clustered or distributed heterogeneous
databases and/or data files in which the federation of databases is referenced or treated
as a single object, as well as in an apparatus for parallel data access and concurrent
execution of object methods across the distributed data. The single object is
referenced hereafter as the "data source object," sometimes abbreviated to simply
"data object."

Still other objects, features and advantages of the present invention will
become readily apparent to those skilled in the art from the following detailed
description, wherein is shown and described only the preferred embodiment of the
invention, simply by way of illustration of the best mode contemplated of carrying out
the invention. As will be realized, the invention is capable of other and different
embodiments, and its several details are capable of modifications in various obvious
respects, all without departing from the invention. Accordingly, the drawings and
description are to be regarded as illustrative in nature, and not as restrictive, and what
is intended to be protected by Letters Patent is set forth in the appended claims. The
present invention will become apparent when taken in conjunction with the following
description and attached drawings, wherein like characters indicate like parts, and
which drawings form a part of this application.

BRIEF DESCRIPTION OF THE DRAWINGS:

Figure 1 is a system block diagram illustrating a method and apparatus
according to the preferred embodiment of the invention;

Figure 2 is a flow diagram illustrating structure and operation of an
agent process according to the preferred embodiment;

Figure 3 is a block diagram illustrating system architecture according
to the preferred embodiment;

Figure 4 is a flow diagram illustrating a messenger process according
to the preferred embodiment;

Figure 5 is an inheritance diagram illustrating metadata employed
according to the preferred embodiment;

Figure 6 is a schematic block diagram illustrating a node employing a
static start-up process;

Figure 7 is a schematic block diagram illustrating a node employing a
dynamic start-up process;

Figure 8 illustrates operation of an agent process at a local node in
response to a request containing concatenated methods;

Figure 9 illustrates operation of an agent process at a remote node in
response to a message generated according to Fig. 8; and

Figures 10A-C and 11A-D illustrate respective Java Studio design
panels adapted to form part of an alternate embodiment.

DETAILED DESCRIPTION OF ONE EMBODIMENT:

Figure 1 illustrates a plurality of remote sites or nodes 11, 13, 15, 17
wherein data to be retrieved or accessed is typically spread across the respective
nodes. In the illustrative example of Figure 1, the data at node 11 comprises
Microsoft NT files, the data at node 13 comprises an Oracle database, the data at node
15 comprises an SQL Server database, and the data at node 17 comprises a Microsoft
Access database.

In one example of operation of the system of Figure 1, a user at user
site or node 19 propounds a simple request which automatically sets in motion
concurrent parallel accessing of all the remote databases 11, 13, 15, 17. The request
illustrated in Figure 1 is a search request and the parallel searches are referenced
respectively as Search 1, Search 2, Search 3 and Search 4. The searches provide
parallel access to the heterogeneous data using a metadata approach and treating the
heterogeneous data as if it were a single object. The simple query or request is first
interpreted so as to pass the relevant part of the script from a user node across to the
remote nodes. In the embodiment under discussion, queries or requests are presented
as JAVA scripts.

Each of the searches is optimized with respect to the underlying data.
For example, there are a number of ways of accessing the Oracle database, such as via
an ODBC connection or via the Oracle Call Interface. According to the preferred
embodiment, the method used to access the Oracle database is via the Oracle Call
Interface. This method is optimum for the purpose of the preferred embodiment
because it provides the shortest path length to the data. Thus, standard database
interfaces are used, while selecting the one which provides the shortest path length.
The user writing the query statement is unaware of the approach used to actually
access the data.

The metadata describes the contents of the data object of a request
(query). The metadata is contained in a repository 18, using data object models which
describe the overall federation of servers and data sources. In the preferred
embodiment, there are four categories of data source objects:

    Distributed over the nodes of a cluster
    Distributed over a network
    Distributed over an SMP (symmetric multiprocessor)
    Not distributed

A distributed network can be an Ethernet or nodes on a cluster or a gigabit/sec
connection.

A repository application generates a set of data source descriptor files
automatically from the metadata at run-time. The data descriptor files contain only
the metadata corresponding to the data source object contained in the user-written
script.

The descriptor files are held locally in NT flat files, and are used at
run-time in the interpretation of the query requests. The use of optimized local files
further supports high run-time performance. The repository used according to the
preferred embodiment is the Unisys Repository (UREP). Various other repositories
could be used such as Microsoft's or a standard one such as is being developed by the
Object Management Group.

The descriptor file name is also used as the name of the data object in
the query scripts, which data object represents the highest level of abstraction of the
federation of data in question. For example, the descriptor file corresponding to an
object, cluster.population, would be called "cluster.population." A user might write a
query, for example:

    cluster.population.search (if (bdate = xx/xx/xx)),

searching the population (perhaps the population of the United States) for all persons
with a particular birthdate. As discussed in more detail below, an "agent" interpreting
this script will refer to the local descriptor file, cluster.population, to determine the
nature of the object.
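
The first thing an interpreting agent needs from such a statement is the data object name, which doubles as the descriptor file name, and the method to be applied. The following Python sketch illustrates one way this separation might be done. It is an illustration only, not the interpreter of the preferred embodiment; the function name split_statement and the rule that the last dotted component before the opening parenthesis is the method are assumptions.

```python
def split_statement(statement: str) -> tuple[str, str, str]:
    """Split '<object>.<method>(<args>)' into (object name, method, raw args).

    Assumption: the method is the last dotted component before the first '('.
    """
    statement = statement.strip()
    open_paren = statement.index("(")
    if not statement.endswith(")"):
        raise ValueError("statement must end with ')'")
    head = statement[:open_paren].strip()
    args = statement[open_paren + 1:-1].strip()
    # Everything before the final dot names the data object (and its descriptor file).
    object_name, _, method = head.rpartition(".")
    if not object_name or not method:
        raise ValueError("expected '<object>.<method>(...)'")
    return object_name, method, args


name, method, args = split_statement(
    "cluster.population.search (if (bdate = xx/xx/xx))"
)
# 'name' is the descriptor file the agent would look up locally.
```

Here the object name recovered from the statement is "cluster.population", which is the descriptor file the agent would consult.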

In the case of Figure 1, the metadata indicates that the data is contained
in the SQL Server, Oracle and/or NT files databases 11, 13, 15 and sets forth the
organization of all the data in the respective databases, e.g. the columns and rows and
how to interpret the data stored in the database. Accordingly, the user at site 19 does
not need to know the data structure and is thus writing applications at a transparent
level, i.e., treating the whole network as a single object and writing methods on it.

A special interpreter or "agent" process is employed at the local or user
site, which interprets the script/request and "looks up" the appropriate metadata from
the NT descriptor file. The local agent then sends appropriate scripts to the particular
nodes which contain data corresponding to the data object. An agent (interpreter)
module located at each remote node interprets and executes received scripts.

Each agent comprises a module of code (an NT process or the
equivalent in another operating system). Thus, two levels of interpretation are
employed, a first to interpret the script and a second to interpret and execute the
interpreted script at the appropriate nodes. As much processing as possible is
performed close to the data, i.e., at the physical sites where the data is stored, in order
to minimize message traffic between user and nodes. Thus, a function shipping model
is used.

According to the example being discussed in connection with Figure 1,
the agent at each remote site 11, 13, 15, 17 receives the interpreted client request,
which includes a data source object name and the methods to be applied, which were
originally embedded in the script generated by the user. The remote agent determines
from the data source object (1) whether the data is distributed, and if so, (2) the way in
which it is distributed. These details (1) and (2) are contained in the repository 18 of
metadata. Once armed with items (1) and (2), the remote agent performs the required
method(s) upon the data.

The first level (local) interpretation of the two level interpretation
process will now be further detailed in conjunction with Figure 2 and an illustrative
example of operation according to the preferred embodiment of the invention. This
example considers the client request as being received by an agent at the user site 19,
although the request could be received by an agent at a remote site.

According to step 31 of Figure 2, an agent at the user site 19 first
receives the client request, which, in the preferred embodiment, is in the form of a Java
script. The agent at the user site 19 then interprets the script. The data source object
name (e.g., C_sql_data) is embedded in the script, as are the methods to be invoked
on the referenced data source (e.g., "sort" in C_sql_data.sort(state(d))).

The data source object is categorized by whether it is distributed, and
the way in which it is distributed. The category of the data source object is specified
in the data source descriptor file. As noted above, the latter is a text file with the same
name as the data source object itself, i.e., C_sql_data.

At the beginning of the local interpretation of the script, the local agent
imports the descriptor file, step 33 of Figure 2. In step 35, the local agent examines
the descriptor file and determines the next processing step, depending on the category
of the referenced data source object.

If, in step 35, the data source category is determined to be
"distributed," the agent proceeds to step 37 and breaks the script into new scripts
appropriate to the designated nodes. The new scripts are then sent to the designated
nodes for further concurrent processing, steps 38, 39 of Figure 2. The agent on the
processing node checks the data source type to determine the next processing step
(there are three data source types: NT file system, SQL Server, Oracle) and then
proceeds with the processing.

If, in step 35, the local agent determines that the data source is
non-distributed, the agent proceeds to the test 41 to check whether the data source
location is local or not. If not local, the agent passes the script unchanged to the
designated node, step 45; if local, the agent checks the data source type for the next
processing step and proceeds with processing, step 45.
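
The branching of steps 35 and 41 can be summarized in a few lines. The Python sketch below is a simplified model of that decision only; the Descriptor fields, the category strings and the returned action descriptions are assumptions introduced for illustration and do not reproduce the actual descriptor file format.

```python
from dataclasses import dataclass


@dataclass
class Descriptor:
    category: str     # assumed values: "distributed" or "non-distributed"
    location: int     # node hosting the data (used in the non-distributed case)
    source_type: str  # e.g. "NT file system", "SQL Server", "Oracle"


def route(descriptor: Descriptor, local_node: int) -> str:
    """Return the next action the local agent takes for a request."""
    if descriptor.category == "distributed":
        # Step 37: break the script into per-node scripts and send them out.
        return "split script and send to designated nodes"
    if descriptor.location != local_node:
        # Test 41 answered "not local": forward the script unchanged.
        return "forward script unchanged to designated node"
    # Local, non-distributed data: dispatch on the data source type.
    return f"process locally as {descriptor.source_type}"


action_cluster = route(Descriptor("distributed", 1, "SQL Server"), local_node=1)
action_remote = route(Descriptor("non-distributed", 3, "Oracle"), local_node=1)
action_local = route(Descriptor("non-distributed", 1, "NT file system"), local_node=1)
```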

The following code provides an example of local interpretation of the 
user script, C_sql_data.sort(state(d)), presented at node 1 of a cluster: 

    #import C_sql_data
    main() {
        C_sql_data.search(if (b_date == xx/xx/xx))
    }

Descriptor file C_sql_data resembles:

    SERVER = 1(sql_data), 3(sql_data), 5(nt_data)
    }

Descriptor file sql_data resembles:

    SERVER = 1; MS; sql_data = publishrauthors;
    {
    au_id* unique CHARACTER(11)
    State* null CHARACTER(2)
    }

According to this example, a data source object, C_sql_data, is
searched for persons with a particular birthdate. A data source descriptor file, with the
same name as the data source object, indicates that C_sql_data is distributed across
servers 1, 3 and 5 of a cluster. Descriptor files on each node give details of the data
distributed on that node (in this case, the data is in SQL Server databases on servers 1
and 3, and in an NT file system on server 5).

The agent on local server 1 begins execution of the script by importing
the data source descriptor file, C_sql_data. The category of the data is "cluster," the
hosting server is "1" with the data distributed on servers 1, 3 and 5. The agent
processes the statement. In due course, the agent will check the syntax and verify, for
example, that "b_date" is specified as a column in the descriptor of the sql_data
object.

In processing the statement, the agent breaks the script into

    sql_data.search() for server 1;
    sql_data.search() for server 3;
    nt_data.search() for server 5

The agent on server 1 processes the first statement; the second
statement is sent to server 3; and the third statement is sent to server 5. There is an
object with a descriptor file name, sql_data, on server 3 and an object with a
descriptor file name nt_data on server 5. After the processing (sorting) at each node,
the information is returned to the original (coordinating) agent for final processing.
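
The breaking of one statement into per-server statements follows mechanically from the SERVER line of the descriptor file. The following Python sketch shows the idea under stated assumptions: the helper names parse_server_line and split_script are hypothetical, and the regular expression assumes entries of the form "number(object name)" as in the example descriptor.

```python
import re


def parse_server_line(line: str) -> list[tuple[int, str]]:
    """Parse 'SERVER = 1(sql_data), 3(sql_data), ...' into (server, object) pairs."""
    _, _, rhs = line.partition("=")
    return [(int(num), obj)
            for num, obj in re.findall(r"(\d+)\s*\(\s*(\w+)\s*\)", rhs)]


def split_script(method_call: str, placements: list[tuple[int, str]]) -> dict[int, str]:
    """Build one statement per server, addressed to that server's local object."""
    return {server: f"{obj}.{method_call}" for server, obj in placements}


placements = parse_server_line("SERVER = 1(sql_data), 3(sql_data), 5(nt_data)")
scripts = split_script("search(if (b_date == xx/xx/xx))", placements)
# scripts[1] stays with the local agent; scripts[3] and scripts[5] are shipped out.
```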

By using a function shipping model, in which the search commands are
sent to be executed as close to the data as possible, and only the results ("hits") are
returned to the requester, the network traffic is minimized (compared with a data
shipping model, in which all the data might be sent to the requester, and the search
performed there). In the event that updates are involved, the approach also ensures
that there will never be a later update in another server's cache, thus maintaining
cache coherency across servers.
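
The difference in traffic between the two models can be made concrete with a toy comparison. The Python sketch below counts how many records would cross the network under each model; the in-memory dictionary standing in for the nodes and the record counts are invented for illustration and are not measurements of any real system.

```python
def data_shipping(nodes, predicate):
    """Ship every record to the requester, then search there."""
    shipped = [row for rows in nodes.values() for row in rows]
    hits = [row for row in shipped if predicate(row)]
    return hits, len(shipped)          # all records crossed the network


def function_shipping(nodes, predicate):
    """Run the search at each node and ship back only the hits."""
    hits = []
    for rows in nodes.values():
        local_hits = [row for row in rows if predicate(row)]  # executes at the node
        hits.extend(local_hits)
    return hits, len(hits)             # only the hits crossed the network


# Toy data: node number -> records held at that node.
toy_nodes = {1: [1, 2, 3, 4], 3: [5, 6], 5: [7, 8, 9]}
is_even = lambda value: value % 2 == 0

hits_data, sent_data = data_shipping(toy_nodes, is_even)
hits_func, sent_func = function_shipping(toy_nodes, is_even)
```

Both models return the same hits; in this toy case function shipping transfers 4 records where data shipping transfers 9.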

Figure 3 provides an illustrative system architecture. According to 
Figure 3, a Visual Basic client 51, a browser 55, or an Active Server Page, interfaces 
to an ActiveX component 53. The client sets information to describe its request (e.g., 
the name of a file containing a script to be executed) in a table within the ActiveX 
component 53 and calls a "send" method within the component. The ActiveX 
component 53 interfaces with a Messenger code module 59 via a Sockets interface. In 
this way, the apparatus appears to the client to be an ActiveX component. 

The "messenger" 59 listens for messages from the Sockets interface
57, and its operation is illustrated in connection with Figure 4. This module of code
contains two key NT or Unix threads (or the equivalent for other operating systems): a
send thread and a receive thread. The receive thread listens for new messages from a
client or from an agent. The send thread returns results to the client, or sends requests
to another server.

As indicated by steps 63, 65, 67 of Figure 4, on receiving a message
from the Sockets interface 57, the messenger 59 queues the request for interpretation
by an "agent" process 61, which analyzes the message and performs the request. If,
on receipt of a message, the messenger 59 detects that all agent processes are busy at
test 69, additional agents may be created, step 71, up to a maximum, using standard
NT or Unix or equivalent operating system process initiation calls. If all agents are
not busy, the next available agent process will interpret the request, as indicated by
step 73.
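
The agent assignment policy just described (use an idle agent, otherwise create one up to a maximum, otherwise leave the request queued) can be modeled without real processes. The Python sketch below is such a model; the class and method names are assumptions, and actual process creation through operating system calls is deliberately left out.

```python
from collections import deque


class Messenger:
    """Toy model of request assignment to a bounded pool of agents."""

    def __init__(self, initial_agents: int, max_agents: int):
        self.busy = [False] * initial_agents   # one flag per existing agent
        self.max_agents = max_agents
        self.pending = deque()                 # requests waiting for an agent

    def submit(self, request):
        """Return the index of the agent given the request, or None if queued."""
        for index, is_busy in enumerate(self.busy):
            if not is_busy:                    # next available agent takes it
                self.busy[index] = True
                return index
        if len(self.busy) < self.max_agents:   # all busy: create another agent
            self.busy.append(True)
            return len(self.busy) - 1
        self.pending.append(request)           # at the maximum: wait in the queue
        return None

    def finish(self, agent: int):
        """Agent is done; return its next queued request, or None if it goes idle."""
        if self.pending:
            return self.pending.popleft()      # agent stays busy with this request
        self.busy[agent] = False
        return None


messenger = Messenger(initial_agents=1, max_agents=2)
first = messenger.submit("request a")    # taken by the existing agent
second = messenger.submit("request b")   # a second agent is created
third = messenger.submit("request c")    # pool is at its maximum, so it is queued
handed_over = messenger.finish(0)        # agent 0 picks up the queued request
```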

On detecting that the data is distributed, the agent breaks the script into
the appropriate scripts for each data source as discussed above and queues a request to
the "messenger" process to send these scripts to the respective distributed servers to
be processed in parallel. Thus, if successive "NO's" occur at tests 65 and 75 of Figure
4, and a "YES" results at test 79, parallel requests are sent out. The receiving
"messenger" process at the destination server queues the request to an "assistant
agent" (which differs from an "agent" only in that it is always invoked from, and
replies to, another "agent," rather than to an external client). The assistant agent
interprets the script (for example, a "search" of local data), queuing the results and
presenting a request to the local "messenger" for return to the requesting agent.

Thus, when test 83 of Figure 4 is satisfied, results are returned to the
local messenger in step 84 where the results are then consolidated. The agent may
then request the messenger to return results to the client, test 75, step 77. In this way,
automatic execution of methods is achieved across distributed heterogeneous data (in
NT files, SQL Server, Oracle, ...) transparently to the requester without the writer of
the request (script) having to be aware of where the data is located, how it is accessed,
where the methods execute or how they were created. If the data is distributed, the
execution runs automatically in parallel. With implementation of the agent and
messenger models on different operating systems, the servers may run on a
heterogeneous mix of NT, Unix, 2200, A-Series, IBM, ... etc.
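
The transparency described above, one request fanned out in parallel to differently organized data and consolidated on return, can be sketched as follows. This Python example uses threads and in-memory stand-ins for the data sources; the two search functions, the SOURCES table and the sample records are assumptions made for illustration and do not correspond to real NT file, SQL Server or Oracle interfaces.

```python
from concurrent.futures import ThreadPoolExecutor


def search_flat_file(lines, wanted_state):
    """Stand-in for a flat-file source: comma-separated 'name,state' lines."""
    return [line.split(",")[0] for line in lines
            if line.split(",")[1] == wanted_state]


def search_table(rows, wanted_state):
    """Stand-in for a relational source: rows represented as dictionaries."""
    return [row["name"] for row in rows if row["state"] == wanted_state]


# node number -> (access routine suited to that source, the data held there)
SOURCES = {
    11: (search_flat_file, ["ann,PA", "bob,NY"]),
    13: (search_table, [{"name": "cy", "state": "PA"}]),
    15: (search_table, [{"name": "di", "state": "CA"}]),
}


def parallel_search(wanted_state):
    """Run every node's search concurrently and consolidate the hits."""
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {node: pool.submit(search, data, wanted_state)
                   for node, (search, data) in SOURCES.items()}
        consolidated = []
        for node in sorted(futures):           # deterministic consolidation order
            consolidated.extend(futures[node].result())
    return consolidated


pa_residents = parallel_search("PA")
```

The caller names only the state being sought; which routine reads which source is decided by the table, much as the descriptor files decide it in the embodiment described.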

Figure 5 is an inheritance diagram further illustrating organization of
the metadata according to the embodiment under discussion. The box labeled "UREP
Named Version Object" 201 represents the highest level of abstraction in the UREP
and comprises a collection of data objects. The diagram of Figure 5 illustrates the
basic concept that each data object contains embedded data and methods (operations)
applied against the data where the data further consists of attributes and types.

Figure 5 illustrates a second level of abstraction 212, which includes
derived classes identified as System Node 202, System Server 203, Data Source
Object 204, Field Desc 205 and System Script 206. Thus, each data object has
associated therewith information as to the system node(s) where it resides, the system
servers within a node which access it, its attribute as being distributed or
nondistributed, the field descriptors for NT files and the methods associated with it.

The System Node class 202 includes information sufficient to describe
each node in a cluster including attributes such as the Node Address which may, for
example, represent an internet port sufficient to locate a node in question. The class
202 further includes construct() and destruct() methods to create or destroy a node.

The System Server class 203 includes all attributes and parameters
regarding each server which resides on a node, where the "server" comprises the
messenger, agent and assistant agent codes, i.e., everything necessary to receive a
script and to execute it. The server attribute illustrated in Figure 5 is the server port,
which is the address (node and port) at which incoming messages are "listened for" by
the messenger of the server in question.

The Data Source Object 204 comprises the names used for various
objects in the script. The attribute "DSC category" indicates whether the particular
object is distributed (207) or nondistributed (208). A distributed object 207 further
includes subclasses 209, 210 as to the type of distribution, i.e., across SMP nodes or
across nodes of a cluster. The "ObjList" attribute gives a list of the databases
contained within the distributed data source name. In other words, the object name is
broken down into sub-names which exist on the different nodes.

Non Distributed Data Sources 208 typically are either NT files 211 or a
relational database object 213, which further break down into column, index, table
and size schema 215, 216, 217, 218 as known to those skilled in the art.

The Script class 206 contains the location of any otherwise
unrecognized programs or methods and could contain programs or methods contained
in URL's, in CORBA ORB environments, X/OPEN OLTP environments, as well as
in local or remote NT executables or other script files.

Thus, a system Node contains one or more servers, each of which hosts
its own set of Data Source Objects. The relationships represented in Figure 5 and
contained in the metadata indicate what Data Source Objects are related to which
servers and thus supply the information necessary to create the local data source
descriptor files at run-time.
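
The containment relationships of Figure 5 (a node holds servers, a server hosts data source objects) lend themselves to a small data model. The Python sketch below mirrors only the attributes named in the text; the class layout, the field names in snake case and the helper descriptor_names are assumptions for illustration and are not the UREP schema.

```python
from dataclasses import dataclass, field


@dataclass
class DataSourceObject:
    name: str                                      # also the descriptor file name
    dsc_category: str                              # "distributed" or "nondistributed"
    obj_list: list = field(default_factory=list)   # sub-names on the individual nodes


@dataclass
class SystemServer:
    server_port: int                               # where the messenger listens
    data_sources: list = field(default_factory=list)


@dataclass
class SystemNode:
    node_address: str
    servers: list = field(default_factory=list)


def descriptor_names(node: SystemNode) -> list:
    """Names of the descriptor files that would be needed on this node."""
    return sorted({source.name
                   for server in node.servers
                   for source in server.data_sources})


sql_data = DataSourceObject("sql_data", "nondistributed")
nt_data = DataSourceObject("nt_data", "nondistributed")
example_node = SystemNode(
    node_address="node-1",                         # hypothetical address
    servers=[SystemServer(5000, [sql_data]), SystemServer(5001, [sql_data, nt_data])],
)
needed = descriptor_names(example_node)
```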

The information represented by Figure 5 is preferably captured at
system set-up using a graphical interface under control of a system administrator with
as much automation as possible in order to avoid unnecessary data entry. For
example, such an interface provides automatic scanning of the rows and columns of a
relational database. Once set up, the system runs applications automatically as
illustrated herein.

The metadata may also include the location of otherwise unrecognized
services, the API's (application programming interfaces) or protocols to be used in
invoking services (effectively wrapping the "foreign" services). Services may also be
sought in trading (OMG, ODP, etc.) networks, allowing a broad heterogeneity of
service access, execution and creation. In this way, services invoked as a simple
JAVA method may actually have been provided in Open/OLTP, Corba objects,
Microsoft DCOM/COM+, Sun EJB, Linc, MAPPER, ..., or other environments. In
this respect, an infrastructure is provided akin to a parallel nervous system for the
invocation and integration of heterogeneous services (invoked as JAVA methods). A
system according to the preferred embodiment can span platforms, OS's, and
architectures without a requirement for changes in the underlying OS.

In an implementation according to Figure 6, servers implementing the
preferred embodiment run on all the nodes of a system which may be, for example, a
cluster, a Unisys cellular multiprocessing system (CMP), a network, or an SMP
(symmetrical multiprocessor). The servers are preferably started by executing a
program, "jobstart," from any node in the system. "Jobstart" calls an NT service,
registered as "Start Service" automatically at "boot" time on each of the system's
nodes, defined in a configuration file. The "Start Service" serves as a listener on the
host node in question, performing the loading and invocation of the local runtime
processes comprising the messenger and agent. Multiple processes may be activated,
automatically, in the same node depending on performance considerations. As soon
as the servers have been activated, the runtime process is ready to accept client
requests.

PATENT DOCKET IJNI6-BI30/04MV1089

In Figure 7, the configuration of Figure 6 is shown supplemented by a
repository (UREP). Instead of a static start-up of all the servers in the system, a
dynamic invocation, based on the client (user) request, is now provided. Based on the
data source name (data object) supplied in the client request, the server to which the
client application is attached, in processing the user request, retrieves from the
repository the details of the locations which support the data source. The Agent
process interpreting the scripts then dynamically activates only the servers required to
support the user's request. The Agent is shown interacting with a DBMS (Database
Management System). A hardware component suitable for implementing the system
servers in a system like that of Figures 1, 6 or 7 is the Aquanta as manufactured by
Unisys Corporation, Blue Bell, Pennsylvania.

The Messenger is loaded and activated by the local NT service (the
Start Service) on each node in the system. Initially, the client application, responding
to a user's request, establishes a connection, via the WinSock interface, with this
process (server). The server (process) acts as a "messenger" between the client and
the agent process for the particular user. The "messenger" performs four key
functions:

    - Acts as the "listener" to receive user requests from the client or
      from an agent on another node.

    - Sends the results of the request back to the submitter of the
      request (the client or an agent on another node).

    - Manages the creation of, and the assignment of tasks to, agent
      and assistant processes.

    - Sends and receives messages to and from these agents and
      assistants, using shared memory.

As noted above, the Agent process accepts and sends messages from
and to the request queue, maintained by the messenger. As illustrated above, the key
functions performed by the agent are to parse and process each request in the JAVA
script, often resulting in operations on named data sources within the system which
may be heterogeneous (e.g., in NT files, SQL Server, Oracle, ...) and distributed. In
so doing, the agent looks up the descriptor of the data source. If the data is distributed
across multiple nodes, the agent rewrites the script as multiple scripts. Each of the
latter scripts consists of the operations, for a particular node specified in the
descriptor, to be performed on the data sets residing in that node. These scripts are
then sent to the "assistant" processes on these other nodes in accordance with the
"function shipping" model. The system will typically be configured to run with an
initial number of agent processes, with a maximum also specified.

In Figures 6 and 7 "node" is used to describe the physical hardware,
e.g., an Aquanta server (as in a "node on the network" or a "node of a cluster"). A
server is the "apparatus" residing on that node comprising the messenger, agent and
assistant code modules. Multiple servers may reside on a single node. The servers
may be viewed as comprising part of a "federation." A federation is a group of
servers which have access to the same data, objects and scripts. There may be more
than one federation in a system.

Figure 8 illustrates processing of a script which contains multiple successive
or "concatenated" methods. In test 31 of Figure 8, the metadata is checked by the
agent at the local site to determine whether the data source is distributed. Test 31
corresponds to test 31 of Figure 2.

In step 301, the local agent scans the script. In test 303, the local agent
determines whether successive methods are included in the script. If not, the routine
proceeds to step 35 of Figure 2 of the original application.

If successive methods are involved, the flow proceeds to step 305 where the
local agent determines which methods should be performed at the remote sites. This
determination is preferably made by accessing a simple table which indicates whether
a selected method should be performed remotely adjacent the data or at the user site
upon the returned results.

In step 307, the statement is broken into scripts appropriate to the servers at
the remote nodes. For example, one may propound the statement:

    population.search( ).sort( ).mail( )

to search, for example, the population of the United States for people with particular
attributes, sort the results of the search, and then mail the results of the sort. In such
case, if the data in "population" were distributed across databases in servers on nodes
1, 3 and 5, the script:

    population.search( ).sort( )

is sent to the servers at each of the nodes 1, 3 and 5. Thus, in this example, the local
agent has determined from a table that "search" and "sort" are methods designated for
performance at the remote sites, and has generated an appropriate script to send to
each of the sites.
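
The table lookup of step 305 and the splitting of step 307 can be sketched together. In the Python example below, the table RUN_REMOTELY and the function split_chain are hypothetical names; the sketch also assumes that method arguments contain no periods, since it separates the methods by splitting on ".".

```python
# Assumed contents of the "simple table": True means run adjacent to the data.
RUN_REMOTELY = {"search": True, "sort": True, "mail": False}


def split_chain(statement: str):
    """Split 'obj.m1( ).m2( ).m3( )' into (script for remote nodes, local calls)."""
    obj, *calls = [part.strip() for part in statement.split(".") if part.strip()]
    remote = []
    for call in calls:
        method_name = call[:call.index("(")].strip()
        if not RUN_REMOTELY.get(method_name, False):
            break                      # first local method ends the remote prefix
        remote.append(call)
    local = calls[len(remote):]        # performed at the user site on returned results
    remote_script = ".".join([obj, *remote]) if remote else None
    return remote_script, local


remote_script, local_calls = split_chain("population.search( ).sort( ).mail( )")
```

Only the leading run of remotely designated methods is shipped, which keeps the order of the successive methods intact.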

The assistant agent at each of the remote servers on nodes 1, 3 and 5 then
interprets the respective script and, on finding the successive methods, search( ).sort( ),
performs the first method (search( )) and then leaves the results of that method
stored in memory, rather than causing the results to be returned to the coordinating
local agent. The second (or further) method(s) are then performed on the results of
the earlier method(s), and only when the results of the succession of methods are
complete, are the results returned to be merged by the coordinating agent. In this way,
if the data object ("population") is distributed, the methods (search, sort) are
performed automatically in parallel on the distributed data.

An example of operation of the remote agent is illustrated in Figure 9. The
data object "population" 403, 405, 407 is retrieved at each of three respective nodes:
Node 1, Node 2 and Node 3. The method "search( )" is performed by the remote
agent on each respective data object, producing respective search results 409, 411, 413
stored temporarily in memory at each of the respective Nodes. The remote agent then
executes "sort( )" on each of the respective search results, yielding respective sort
results 415, 417, 419. The remote agents then transfer the respective sort results to the
respective remote messengers, which return them to a coordinating agent at the
originating site. The coordinating agent creates the merged results 421 and executes
the Mail method to e-mail the final results.

The search, sort and mail methods are described further in connection with the 
following discussion of a preferred set of methods performed by the Agent and 
summarized in the following table: 

TABLE I 
SYSTEM METHODS 

      SQL Server   NT File    ORACLE     MNT        CLUSTER    TEMP 
 1    info         info       info       info       info       info 
 2    format       format     format     format     format     format 
 3    groupby      groupby    groupby    groupby    groupby    groupby 
 4    compute      compute    compute    compute    compute    compute 
 5    search       search     search     search     search     search 
 6    sort         sort       sort       sort       sort       sort 
 7    load         load       load       load       load       load 
 8    copy         copy       copy       copy       copy       copy 
 9    extract      extract    extract    extract    extract    extract 
10    remove       remove     remove     remove     remove     remove 
11    modify       modify     modify     modify     modify     modify 
12    join         join       join       join       join       join 
13    mpl          mpl        mpl        mpl        mpl        mpl 
14    size         size       size       size       size       size 


The provision of such a set of key methods (the basic components of 
every application) greatly enhances the ease of application development. Additional 
methods employed are reflected in the following Table II. In addition, of course, 
user-written methods may be invoked. 



TABLE II 

      SQL Server   NT File      ORACLE     MNT        CLUSTER    TEMP 
 1    adon         adon         adon       adon       adon       adon 
 2    adto         adto         adto       adto       adto       adto 
 3    find         find         find       find       find       find 
 4    update       update       update     update     update     update 
 5    first        first        first      first      first      first 
 6                 mail                                          mail 
 7                 print                                         print 
 8                 save                                          save 
 9                 saveObject                                    saveObject 
10                 SaveScript                                    SaveScript 
11                 read                                          read 
12                 write                                         write 
13                 close                                         close 
14                 index                                         index 


Further, according to the preferred embodiment, the following logic, 
controls, environmental variables and commands are provided: 

TABLE III 
SYSTEM COMMANDS 

Logic       control     environment   commands 
if          import      time$         sleep() 
else        local       date$         close() 
while       main        day$          write() 
try         private     mon$          system() 
catch                   year$         format() 
continue                wday$         audio() 
break                   mday$         debug() 
                        dir$          mpl() 
                                      trace() 


Further discussion of exemplary implementation of various controls, 
commands and methods will further illustrate the utility, structure and operation, and 
advantages of the preferred embodiment. 

#IMPORT: control 

The import control is used to identify a data object from the repository. 
The object name following the import statement is a unique name in the system. As 
noted, the different data sources processed according to the preferred embodiment are 
SQL Server, Oracle, NT files, and multiple NT files. The data object may, of course, 
include other data sources. 

The import statement makes available to the script interpreter (agent) 
all the metadata files describing the contents of the selected data object. The first 
occurrence of "import" causes the appropriate data source descriptor file to be set up 
at each node of the system, each such descriptor file containing the metadata 
corresponding to the data source object in the user-written script. The API required to 
access the data is also determined within this statement. The developer never has to 
be concerned about the type of data, whether the data is local or clustered, or even 
whether the data has been moved or columns modified. Because the script is interpreted 
at run time, current column sizes, data location, API types, whether to use parallel 
methods, and so on are all handled dynamically. 

Example: 

#import Personnel 
main() { 
    Personnel.sort( birthday ); 
} //main 

In the above example, . . . 
#LOCAL: control 

This control is used to identify a temporary data object or record set. 
The object name following the local statement will be a unique name for this script. 
The different temporary data sources which can be processed are SQL Server, Oracle, 
NT files, clustered data sources and multiple NT files. 

The LOCAL statement makes all connections required for this data 
object. It is possible for a data object to consist of Oracle and SQL Server or any other 
data source. If there are multiple data sources, all connections are handled by this 
statement. The API required to access the data is also determined within this 
statement. The developer never has to be concerned about the type of data, whether 
the data is local or clustered, or even whether the data has been moved or columns 
modified. Because the script is interpreted at run time, current column sizes, data 
location, API types, whether to use parallel methods, and so on are all handled 
dynamically. 

Example: 

#import Person 
#local recset Result = new Person; 
#local recset Rslt2 = {Person.first_name Person.last_name} 
#local nt picture = @data/bitmaps/picture.bmp{} 
#local ms table = data_source:tablename{l_name character(10) 
                  f_name character(15) ssn character(9)} 

main() { 
    Result.load(Person); 
    Rslt2.load(Result); 
} //main 



DATA OBJECT IDENTIFIERS: 

The following data object identifiers are used on the local control. 
This allows the interpreter to know which API to use to reference the data object. 

IDENTIFIER   DATA API 
MS           Microsoft SQL Server 
ORA          Oracle 
NT           NT Files 
RECSET       Temporary Table 

SCRIPT VARIABLE TYPES: 

The following script variable types are supported: 

CHAR 
INTEGER 
SHORT 
LONG 
FLOAT 
REAL 
DOUBLE 
DECIMAL 
NUMERIC 
BYTE 
STRING 
RECORD 
DATE 
TIME 
TIMESTAMP 



SQL DATA TYPES: 

The following SQL data types in record databases are supported: 

SQL type         Size   Data type 
SQL_CHAR         1      char 
SQL_VARCHAR      1      char 
SQL_BIT          1      char 
SQL_TINYINT      4      long 
SQL_SMALLINT     4      long 
SQL_INTEGER      4      long 
SQL_BIGINT       4      long 
SQL_REAL         8      double 
SQL_FLOAT        8      double 
SQL_DECIMAL      8      double 
SQL_NUMERIC      8      double 
SQL_DOUBLE       8      double 
SQL_DATE         6      DATE_STRUCT 
SQL_TIME         6      TIME_STRUCT 
SQL_TIMESTAMP    16     TIMESTAMP_STRUCT 

PRIVATE: control 

This control identifies and creates a variable for this script. 
Here are some samples: 

private int       aa; 
private long      aa; 
private short     aa; 
private int       aa,bb; 
private int       aa=10,bb = 20; 
private string    strg; 
private char      chr; 
private char[20]  buf; 
private char      buf[201]="abcdefg"; 
private double    dbl = 10.25; 
private float     flt = 10.25; 
private record    rec = new data_object 



ENVIRONMENTAL VARIABLES: 

The system environmental variables can be used in the script just as 
any other string variable. There are also additional string variables and two reserved 
words, which are listed below: 

time$    // current time 
date$    // current date 
day$     // current day 
mon$     // current month 
year$    // current year 
wday$    // current week day 
mday$    // current month day 
dir$     // base directory for this federation 

Reserved words: 

TRUE     // A non zero value 
FALSE    // A zero value 



#TRACE: control 

This control will activate the trace code to aid in the debug of a script. 
It will write script logic records to the script trace file as they are processed. The write 
command will also write data to the same trace file. The greater the number, the more 
information is dumped to the trace file. It is presently preferred to use only a value of 
two or three. 

An example of the "trace" control is: 

#trace 2    //dump script record to trace file 
            //before it is executed 
#import Personnel 
main() { 
    Personnel.search( 
        +------------------------------+ 
        | if (birthday == "06/14/45"); | 
        +------------------------------+ 
    ); 
    write("Search complete"); 
    close().mail("Charles.Hanson@unisys.com", "trace"); 
} //main 

NOTE: The code in the "box" above identifies embedded script code. 
The embedded script code is contained as a parameter within the relevant method and 
will be interpreted as part of the definition of what the particular method should 
perform. 

FORMAT(): function 

This function is used to create a character string from multiple 
arguments. The syntax and rules of sprintf apply. 

An example of the 'format' function is: 

private char buf[20]; 
private int cnt = 25; 
private char[20] name = "total"; 
main() { 
    buf = format("%s count =%d", name, cnt); 
} //main 

UPPER(): function 

This function is used to convert the argument character string to upper 
case characters. It does not change the argument variable. 

An example of the 'upper' function is: 

private char buf[20]; 
private char[20] name = "abcdef"; 
main() { 
    buf = upper(name); 
} //main 

LOWER(): function 

This function is used to convert the argument character string to lower 
case characters. It does not change the argument variable. 

An example of the 'lower' function is: 

private char buf[20]; 
private string name = "ABCDEFGH"; 
main() { 
    buf = lower(name); 
} //main 

STRIP(): function 

This function is used to strip off all leading and trailing spaces in the 
character string. It does not change the argument variable. 

An example is: 

private char buf[20]; 
private char[20] name = " abcdef "; 
main() { 
    buf = strip(name); 
} //main 


CENTER(): function 

This function is used to center the character string in the argument. It 
does not change the variable and the variable must be a character type of fixed length. 

An example of the 'center' function is: 

private char buf[20]; 
private char[20] name = "abcdef"; 
main() { 
    buf = center(name); 
} //main 

LEFT(): function 

This function is used to left justify the character string in the argument. 
It does not change the variable and the variable must be a character type of fixed 
length. 

An example is: 

private char buf[20]; 
private char[20] name = " abcdef"; 
main() { 
    buf = left(name); 
} //main 



RIGHT(): function 

This function is used to right justify the character string in the 
argument. It does not change the variable and the variable must be a character type of 
fixed length. 

An example of the 'right' function is: 

private char buf[20]; 
private char[20] name = "abcdef "; 
main() { 
    buf = right(name); 
} //main 
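
For readers more familiar with a general-purpose language, the string functions described above (format, strip, center, left, right) can be approximated in Python as follows. This is only a sketch under stated assumptions: the field width of 20 mirrors the char[20] buffers in the examples, and it is assumed that the justify functions drop existing padding before re-padding to the fixed width. None of the helper names come from the specification.

```python
WIDTH = 20  # assumed fixed field width, mirroring the char[20] buffers above

def fmt(template, *args):
    """sprintf-style formatting, as format() is described."""
    return template % args

def strip(s):
    """Remove leading and trailing spaces; the argument itself is unchanged."""
    return s.strip(" ")

def center(s, width=WIDTH):
    """Center the text within a fixed-width field (assumed behavior)."""
    return s.strip(" ").center(width)

def left(s, width=WIDTH):
    """Left justify the text within a fixed-width field (assumed behavior)."""
    return s.strip(" ").ljust(width)

def right(s, width=WIDTH):
    """Right justify the text within a fixed-width field (assumed behavior)."""
    return s.strip(" ").rjust(width)

name = " abcdef "
buf = fmt("%s count =%d", "total", 25)
centered = center(name)
```

Python strings are immutable, so, as in the descriptions above, none of these calls changes the argument variable.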



FOR(): function 

This function is the same as the 'for' function in JAVA. 

An example of the 'for' function is: 

private int a; 
main() { 
    for (a=0; a<10; ++a) { 
        // do some logic 
    }; 
} //main 

WHILE(): function 

This function is the same as the 'while' function in JAVA. 

An example of the 'while' function is: 

private int a = 0; 
main() { 
    while (a++ < 10) { 
        // do some logic 
    } //while 
} //main 



BREAK(): function 

This function is the same as the 'break' function in JAVA. 

An example of the 'break' function is: 

private int a; 
main() { 
    for (a=0; a<10; ++a) { 
        if (a == 5) { 
            break 
        } 
        // do some logic 
    } //for 
} //main 

CONTINUE(): function 

This function is the same as the 'continue' function in JAVA. 

An example of the 'continue' function is: 

private int a; 
main() { 
    for (a=0; a<10; ++a) { 
        if (a == 8) { 
            a = 0 
            continue 
        } 
        // do some logic 
    } //for 
} //main 



EXIT(): function 

This function is the same as the 'exit' function in JAVA. 

An example of the 'exit' function is: 

main() { 
    if (more_to_do == FALSE) { 
        exit(); 
    } 
} //main 

IF(): function 

This function is the same as the 'if' function in JAVA. 

An example of the 'if' function is: 

private int a = 0; 
main() { 
    if ( a < 10 ) { 
        // do some logic 
    } else { 
        // do some more logic 
    } 
} //main 



TRY: function 

This function will allow you to watch for application errors and then 
break out of the code and jump to a catch routine. It is the same as the 'try' function 
in JAVA. 

An example of the 'try' function is: 

main() { 
    try { 
        // do some logic 
    } catch () { 
        // do error handling 
    } 
} //main 

CATCH: function 

This function is for handling error conditions. The argument of the 
control is currently ignored. 

An example of the 'catch' function is: 

main() { 
    try { 
        // do some logic 
    } catch () { 
        // do error handling 
    } 
} //main 



SWITCH: function 

This function allows you to selectively execute code depending on the value 
of a variable. It works in conjunction with the case statement. It is the same as the 
'switch' function in JAVA. 

An example of the 'switch' function is: 

private int cnt = 1; 
main() { 
    switch (cnt) { 
    case 0: 
        break; 
    case 1: 
        break; 
    default: 
        break; 
    } //switch 
} //main 

CASE: function 

This function allows you to selectively execute code when the value of the 
switch statement object matches the value of the case statement. It works in 
conjunction with the switch statement. It is the same as the 'case' statement in 
JAVA. 

An example of the 'case' function is: 

private int cnt = 1; 
main() { 
    switch (cnt) { 
    case 0: 
        break; 
    case 1: 
        break; 
    default: 
        break; 
    } //switch 
} //main 



DEFAULT: function 

This function will allow you to identify code as default when there is 
not a match for the case statement. It works in conjunction with switch and case 
statements. It is the same as the 'default' statement in JAVA. 

An example of the 'default' function is: 

private int cnt = 1; 
main() { 
    switch (cnt) { 
    case 0: 
        break; 
    case 1: 
        break; 
    default: 
        break; 
    } //switch 
} //main 



SQLINFO(): (SQL SERVER) command 

This command creates a temporary data object that contains the 
information included in the data source name and table name referenced in the 
argument. The information contained in the resulting file identifies the server 
where the data object is located, the type of data (SQL SERVER) and its data source 
name and table name. It also includes all column names, types and sizes. The object 
created by this command may then be added to the system with the SAVEOBJECT 
method discussed in further detail below. 

The following example illustrates connection to a SQL SERVER data 
source with the name of "tpcd" to create the metadata for the table "lineitem." This 
information would then be recorded as a data object ("tpcd1") in the metadata. This is 
to make it easy to create the metadata corresponding to existing SQL Server and 
Oracle files. 

Example: 

main() { 
    sqlinfo("tpcd:lineitem").saveobject("tpcd1"); 
} //main 



ORACLEINFO: (ORACLE) command 

This command creates a temporary data object that contains the 
information included in the data source name and table name specified in the 
argument. The information contained in the resulting file identifies the server where 
the data object is located, the type of data (ORACLE) and its data source name and 
table name. It also includes all column names, types and sizes. The object created by 
this command could then be added to the system with the saveobject method. 

The example connects to an Oracle data source with the name of 
"tpcd" and creates the metadata for the table "lineitem". This information is then 
recorded as a data object ("tpcd2") in the metadata. 

Example: 

main() { 
    oracleinfo("tpcd:lineitem").saveobject("tpcd2"); 
} //main 

MAIL: command 

This command is used to send e-mail. There are three arguments, of 
which only the first one is required. The first argument is a character string 
containing the person's e-mail address. The second is the subject and the third 
argument is the body of the message. 

The example below searches the Personnel data object for employees 
with a birthday today, identified by the environmental variable date$. It then reads the 
records and sends an e-mail to each employee with a birthday. 

Example: 

#import Personnel 
private record rec = new Personnel; 
private char buf[200]; 
private string bf="hope you have a wonderful day"; 
main() { 
    Personnel.search( 
        if (birthday == date$); 
    ); 
    while (rec = this.read()) { 
        buf = format("Happy birthday %s %s", 
                     rec.first_name, bf); 
        mail(rec.email, "birthday greetings", buf); 
    } //while 
} //main 

WRITE: command 

This command is used to aid in the debug of applications. The data 
specified is written to the trace file of the script. The trace command also writes to 
this same file. 

There are different forms to the arguments. 

write("character string only"); 
write(data_record_object); 

The example below will write the character string with the value of cnt 
to the trace file. 

private int cnt = 25; 
main() { 
    write(format("The value of cnt = %d", cnt)); 
} //main 


SLEEP: command 

This command will suspend the script for the number of milliseconds 
specified in the argument. 

The example below will suspend the script for one minute. 

Example: 

main() { 
    sleep(60000); 
} //main 




CLOSE: command 

This command will close the trace file and make it the current "this" 
object. 

This example creates a trace file and e-mails it to a fictitious 
programmer. 

#trace 2 
#import Personnel 
main() { 
    Personnel.search( 
        if (birthday == "06/14/55"); 
    ); 
    close(); 
    this.mail("Charles.Hanson@unisys.com", "ERR"); 
} //main 



OBJECTS: command 

This command will create a temporary data object that contains a list of 
all the data object names in the federation. 

The example will display, using notepad, a list of all data objects in this 
federation. 

main() { 
    objects().display(); 
} //main 

FORMAT: command 

This command will format a character string. The basic rules of a 
sprintf command in C++ apply. 

The example will write the current date and time. 

Example: 

#import Client 
private char buf[100]; 
main() { 
    buf = format("%s - - %s", date$, time$); 
    Client.write(buf); 
} //main 



MPL: command 

This command starts either a named script or a data object which 
contains a script, on the server specified by the second argument. If the second 
argument is omitted, then it will start the script on the local server. The newly started 
script runs as an independent script and will not return to the requesting script. 

Example: 

#local view script={ } 

main() { 
    mpl("strip_data", 4); 
    script.write("#import Personnel              "); 
    script.write("main() {                       "); 
    script.write("Personnel.search(              "); 
    script.write("    if (state == \"CA\");      "); 
    script.write(")                              "); 
    script.write("this.format(\"%s,%s\",         "); 
    script.write("last_name, first_name);        "); 
    script.write("this.print();                  "); 
    script.write("}                              "); 
    script.close(); 
    mpl(script, 5); 
} //main 

EXECUTE: command 

This command executes the program of the first argument on the server 
specified by the second argument. If the second argument is omitted, then it will start 
the executable program on the local server. The executable program runs as an 
independent program and will not return to the requesting script. 

Example: 

main() { 
    execute("xyz.exe", 4); 
    execute("xyz.bat", 5); 
} //main 

AUDIO: command 

This command will play the wav file specified by the argument. The 
wav file must exist in the directory specified for messages. 

Example: 

main() { 
    audio("greetings"); 
} //main 



WRITE: method 

This method is used to send multiple character records to the data 
object referenced in the script. The method supplies one record at a time. 

Client.write("character string only"); 
Client.write(data_object); 

The example below searches the Personnel data object for employees 
with a birthday on June 14. Notice the wild card character (' [x]') is used to only 
search part of the column. The result of the search will be reformatted into three 
columns with commas separating them. 

Example: 

#import Personnel 
#import Client 
main() { 
    Personnel.search( 
        if (birthday == "06/14/ 1"); 
    ) 
    this.format("%s,%s,%s", last_name, first_name, ssn); 
    Client.write(this); 
} //main 



WRITE: method of temporary data objects 

This method is used for constructing reports that one would e-mail, 
fax, print or send to a bean. 

There are different forms to the arguments. 

Data_object.write("character string only"); 
Data_object.write(data_record_object); 

The first use of this method in a script creates a new object, or erases 
the existing object if one existed. Every reference after the first adds records to the 
method's object. Before other methods can reference this new object, a 'close' method 
must be performed on it. 

The example below writes two records to the temporary data object 
called temp and then sends the result to the bean. Notice that the close method is 
required before the data is sent. The second example below achieves the same results. 

Example: 

#import Client 
#local recset temp={} 
main() { 
    temp.write(format("The number of records = %d", 
                      25)); 
    temp.write("This will be the second record of the data object called 
                temp"); 
    temp.close(); 
    Client.write(temp); 
} //main 
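
The write/close life cycle described above can be modeled with a small Python class. This is a hedged sketch, not the system's implementation: the class name TempObject and its records() accessor are hypothetical, and the sketch simplifies by treating any write after a close as a fresh "first" write.

```python
class TempObject:
    """Hypothetical stand-in for a temporary data object (recset)."""

    def __init__(self, records=None):
        self._records = list(records or [])
        self._writing = False  # True between the first write() and close()

    def write(self, record):
        if not self._writing:
            # First write: erase whatever the object held before.
            self._records = []
            self._writing = True
        # Every later write appends one record.
        self._records.append(record)

    def close(self):
        """Mark the object complete so other methods may reference it."""
        self._writing = False

    def records(self):
        if self._writing:
            raise RuntimeError("close() must be called before the object is read")
        return list(self._records)

temp = TempObject(["stale record"])
temp.write("The number of records = %d" % 25)
temp.write("This will be the second record")
try:
    temp.records()
    read_before_close = True
except RuntimeError:
    read_before_close = False
temp.close()
```

Raising an error on an early read is one possible way to enforce the rule that close must precede any other reference; the text does not say how the real system reacts.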



INFO: method 

This method creates a temporary NT file that contains the data 
definition for the method's object. The information contained in the data definition 
file identifies the server where the data object is located, the type of data (Oracle, SQL 
SERVER, NT file) and its table name and data space. It also includes all column 
names, types and sizes. 

Example: 

#import Personnel 
main() { 
    Personnel.info().display(); 
} //main 

FORMAT: method 

This method creates a temporary data object by merging character 
strings and data columns from the data object. The basic rules of a sprintf command in 
C++ apply. Every record of the method's object is processed. 

The example below creates a temporary data object. The record of the 
new object contains the data from two columns (last_name, first_name) separated 
with two '-' characters. The temporary data object will have the same number of 
records as the original data object and is sent to a bean. 

Example: 

#import Personnel 
#import Client 
main() { 
    Personnel.format("%s - - %s", 
                     last_name, first_name); 
    Client.write(this); 
} //main 



LOAD: method 

This method loads the method's object with the data object in its 
arguments. If the column names of the objects do not match, the columns are left 
blank. Only the like columns are loaded. If columns are of the same type but different 
lengths, they are either truncated or expanded to the size of the receiving column. If 
the data object already existed, all records will be deleted, and the data object will only 
contain the new records. If the data object did not exist, a new one will be created. If 
the column types changed from the previous data object, then a remove method must 
be called before overloading the new object. 

Example: 

#import Personnel 
#import Payroll 
main() { 
    Payroll.load(Personnel); 
} //main 
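
The column-matching rules of the load method can be sketched in Python as follows. The schemas, column widths, and the function name load are illustrative assumptions; values are modeled as fixed-width strings so that truncation and expansion are visible.

```python
def load(target_schema, source_schema, source_rows):
    """Build rows shaped like target_schema (column name -> fixed width).

    Like-named columns are copied and truncated or padded to the receiving
    width; columns with no match in the source are left blank. The returned
    list replaces any records the target held before.
    """
    loaded = []
    for row in source_rows:
        new_row = {}
        for column, width in target_schema.items():
            if column in source_schema:
                value = str(row.get(column, ""))
                new_row[column] = value[:width].ljust(width)
            else:
                new_row[column] = " " * width
        loaded.append(new_row)
    return loaded

# Hypothetical schemas: Payroll shares two columns with Personnel.
personnel_schema = {"last_name": 10, "ssn": 9, "state": 2}
payroll_schema = {"last_name": 6, "ssn": 11, "salary": 8}
personnel_rows = [{"last_name": "Anderson", "ssn": "123456789", "state": "MN"}]

payroll_rows = load(payroll_schema, personnel_schema, personnel_rows)
```

Here last_name is truncated to six characters, ssn is expanded to eleven, salary is left blank, and state is dropped because the receiving object has no such column.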



SORT: method 

This method sorts the method's object by the column(s) in its 
arguments. If there are multiple columns, the order of the sort is determined by the 
order of the columns. Descending sort is specified by a "(d)" following the column 
name. If the arguments contain a numeric value, that value determines the maximum 
number of records to return. 

The example below sorts the Personnel data object on the state column 
(ascending), with a secondary level sort on the last_name column in descending order. 
The numeric "10" specifies to return a maximum of 10 records. A maximum of 10 
records containing two columns (state and last_name) will be returned. 

Example: 

#import Client 
#import Personnel 
#local recset temp={ Personnel.state 
                     Personnel.last_name } 
main() { 
    Personnel.sort( state, last_name(d), 10 ); 
    temp.load(this); 
    Client.write(temp); 
} //main 
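
The argument conventions of the sort method (column order, a "(d)" suffix for descending, and a numeric record limit) can be sketched in Python. The function name sort_records and the sample data are hypothetical; the multi-level ordering relies on Python's sort being stable.

```python
def sort_records(records, *args):
    """Sort by the named columns; "col(d)" means descending, an int caps the result."""
    columns = []
    limit = None
    for arg in args:
        if isinstance(arg, int):
            limit = arg
        elif arg.endswith("(d)"):
            columns.append((arg[:-3], True))
        else:
            columns.append((arg, False))
    result = list(records)
    # Stable sorts applied from the last key to the first yield a multi-level sort.
    for column, descending in reversed(columns):
        result.sort(key=lambda r, c=column: r[c], reverse=descending)
    return result if limit is None else result[:limit]

people = [
    {"state": "MN", "last_name": "Olson"},
    {"state": "CA", "last_name": "Diaz"},
    {"state": "CA", "last_name": "Wong"},
    {"state": "MN", "last_name": "Berg"},
]
top = sort_records(people, "state", "last_name(d)", 3)
```

With this data the call returns the two CA records (Wong before Diaz) followed by the first MN record, Olson.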



SEARCH: method 

This method uses an embedded script within its arguments. The 
embedded script is interpreted by the method and executed for every record in the 
method's object. The result of this method is a temporary data object containing all 
the records that match the search criteria. 

The example below searches the Personnel data object and sorts all 
records where the state column is either "CA" or "MN". The records returned contain 
two columns (state and last_name) that are separated by a comma. 

Example: 

#import Client 
#import Personnel 
main() { 
    Personnel.search( 
        if (state == "CA" || state == "MN"); 
    ); 
    this.sort(last_name); 
    this.format("%s,%s", state, last_name); 
    Client.write(this); 
} //main 
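
As a rough analogue, the search example above can be expressed in Python with the embedded script modeled as a predicate function that is evaluated once per record. The function name search and the sample records are assumptions made for illustration.

```python
def search(records, predicate):
    """Return a new (temporary) list of the records for which predicate is true."""
    return [r for r in records if predicate(r)]

personnel = [
    {"state": "MN", "last_name": "Olson"},
    {"state": "TX", "last_name": "Reyes"},
    {"state": "CA", "last_name": "Diaz"},
]

# Corresponds to: if (state == "CA" || state == "MN");
found = search(personnel, lambda r: r["state"] in ("CA", "MN"))
# Corresponds to: this.sort(last_name);
found.sort(key=lambda r: r["last_name"])
# Corresponds to: this.format("%s,%s", state, last_name);
lines = ["%s,%s" % (r["state"], r["last_name"]) for r in found]
```

The source list is left untouched, which matches the description of the result as a separate temporary data object.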



GROUPBY: method 

This method uses an embedded script within its arguments. Before the 
embedded script is interpreted by the method, the records that satisfy the request are 
selected and then sorted by the object(s) defined within parentheses. The embedded 
script is then performed on the sorted records. Any variable values or objects changed 
in the embedded script will also be changed in the main script. The example below 
processes two states from the Personnel data object and then groups by state and 
returns two records containing the state and number of records for that state. 

Example: 

#import Client 
#import Personnel 
#local recset temp={ } 
private int cnt=0; 
main() { 
    Personnel.groupby( (state) 
        if (state == "CA" || state == "MN") 
        { 
            if (EQUAL) { 
                ++cnt; 
            } else { 
                temp.write(format("state %s = %d", 
                                  Personnel.state, cnt)); 
                cnt = 0; 
            } 
        } 
    ); 
    Client.write(temp); 
} //main 
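
The select, sort, then group sequence described for groupby can be sketched in Python. This sketch does not reproduce the EQUAL flag, whose exact semantics are not spelled out above; it is assumed to indicate that the grouping column is unchanged from the previous record, and itertools.groupby is used here to obtain the per-group counts directly. The function name group_counts and the data are hypothetical.

```python
from itertools import groupby

def group_counts(records, column, predicate):
    """Select matching records, sort them by column, and count each group."""
    selected = sorted(
        (r for r in records if predicate(r)),
        key=lambda r: r[column],
    )
    return [
        (key, sum(1 for _ in group))
        for key, group in groupby(selected, key=lambda r: r[column])
    ]

personnel = [
    {"state": "MN", "last_name": "Olson"},
    {"state": "CA", "last_name": "Diaz"},
    {"state": "TX", "last_name": "Reyes"},
    {"state": "MN", "last_name": "Berg"},
]

counts = group_counts(personnel, "state", lambda r: r["state"] in ("CA", "MN"))
report = ["state %s = %d" % (state, n) for state, n in counts]
```

Sorting before grouping matters: itertools.groupby only merges adjacent records, just as the method sorts the selected records before the embedded script runs.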



EXTRACT: method 

This method uses an embedded script within its arguments. The 
embedded script is interpreted by the method and executed for every record in the 
method's object. Every record in the database that matches the search criteria will be 
deleted from the data object, and a temporary object containing the records will be the 
result of the method. 

The example below will search the Personnel data object and return all 
records in which the state column is either "CA" or "MN". It will also delete these 
records from the data object. The records returned contain two columns (state and 
last_name). 

Example: 

#import Client 
#import Personnel 
main() { 
    Personnel.extract( 
        if (state == "CA" || state == "MN"); 
    ); 
    this.format("%s,%s", state, last_name); 
    Client.write(this); 
} //main 
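
The difference between extract and search is that extract also removes the matching records from the source. A minimal Python sketch, with a hypothetical function name and sample data, is:

```python
def extract(records, predicate):
    """Remove matching records from `records` in place and return them."""
    matched = [r for r in records if predicate(r)]
    records[:] = [r for r in records if not predicate(r)]
    return matched

personnel = [
    {"state": "MN", "last_name": "Olson"},
    {"state": "TX", "last_name": "Reyes"},
    {"state": "CA", "last_name": "Diaz"},
]

removed = extract(personnel, lambda r: r["state"] in ("CA", "MN"))
lines = ["%s,%s" % (r["state"], r["last_name"]) for r in removed]
```

After the call, personnel holds only the TX record, while removed plays the role of the temporary object returned by the method.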



SAVEOBJECT: method 

This method is used to create a new data object name or to create a data 
object on another server to allow parallel activity. The data object name identified by 
the arguments is added to the list of system objects available to all members of the 
system. Once the object name is added to the system, it requires a load or adon 
method to insert data. The method's object is an NT file identical to the result of the 
info method. 

Example: 

#import Personnel 
main() { 
    Personnel.info().display().saveobject("objectx"); 
} //main 

MAIL: method 

This method is used to send e-mail. The method's object is sent as an 
attachment. There are three arguments, of which only the first one is required. The 
first argument is a character string containing the person's e-mail address. 

The example below searches the Personnel data object for employees 
with a particular birthdate. It then reads all records and sends e-mail to each 
employee with a birthday. 

Example: 

#import Personnel 
#import Birthdaycard 
private record rec = new Personnel; 
private char buf[200]; 
private string bf="hope you have a wonderful day"; 
main() { 
    Personnel.search( 
        if (birthdate == date$); 
    ); 
    while (rec = this.read()) { 
        buf = format("Happy birthday %s %s", 
                     rec.first_name, bf); 
        Birthdaycard.mail(rec.email, 
                          "birthday greetings", buf); 
    } //while 
} //main 

STATUS: method 

This method is used to send single record character strings. There are 
different forms to the arguments. 

Client.status("character string only"); 
Client.status(data_object); 

The example below will send a character string, "We are performing 
your request". 

Example: 

#import Client 
main() { 
    Client.status("We are performing your request"); 
} //main 



READ: method 

This method suspends the script and waits until it receives an input. 
The method loads the variable(s) in the arguments with the input. If there are multiple 
variables, then a comma in the input string will separate the data, and the variables 
will get loaded in order. 

The example below will take a character string and parse it. The first 
part, until a comma is encountered, will be loaded into the bdate variable and the data 
following the comma will be interpreted as a number and loaded into the e_sal 
variable. 

Example: 

#import Personnel 
#import Client 
private char bdate[20]; 
private int e_sal; 
main() { 
    Client.read( bdate, e_sal ); 
    Personnel.search( 
        if (birthday > bdate && salary > e_sal); 
    ) 
    this.format("%s,%s,%s", last_name, first_name, ssn); 
    Client.write(this); 
} //main 
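
The parsing step performed by read (split the input on commas, then load each variable in order according to its type) can be sketched in Python. The function name read_into and the sample input are assumptions; the waiting-for-input part of the method is not modeled.

```python
def read_into(input_line, *converters):
    """Split a comma-separated input and convert each part in order.

    One converter is given per target variable, e.g. str for a char
    buffer and int for an integer variable.
    """
    parts = input_line.split(",", len(converters) - 1)
    if len(parts) != len(converters):
        raise ValueError("expected %d comma-separated values" % len(converters))
    return tuple(conv(part.strip()) for conv, part in zip(converters, parts))

# Corresponds to: Client.read( bdate, e_sal );
bdate, e_sal = read_into("06/14/45, 50000", str, int)
```

The text up to the first comma becomes bdate and the remainder is interpreted as a number for e_sal, as in the example above.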



SIZE: method 

This method returns the number of records in the method's object. 
The example below will get the number of records from the Personnel data 
object and return the value. 

#import Client 
#import Personnel 
private int records=0; 
main() { 
    records = Personnel.size(); 
    Client.write(format("The number of records in 
                         Personnel = %d", records)); 
} //main 



DISPLAY: method 

This method displays records in a notepad window. Its purpose is to 
help in the debug of a script. 

The example below will open a notepad window with five records 
from the Personnel data object. 

Example: 

#import Personnel 
main() { 
    Personnel.display( 5 ); 
} //main 

COPY: method 

This method loads the method's object into the data object in its 
arguments. If the column names of the objects don't match, the columns are left 
blank. Only the like columns are loaded. If columns are of the same type but 
different lengths, they will either be truncated or expanded to the size of the receiving 
column. If the method's object did not exist, it will be created. If the column types 
changed from the previous data object, then a remove method must be called before 
overloading the new object. 

Example: 

#import Personnel 
#import Payroll 
main() { 
    Personnel.copy(Payroll); 
} //main 

20 

ADTO: method 

This method adds the method's object to the data object in its
arguments. If the column names of the objects don't match, the columns are left
blank. Only the like columns are added. If the columns are of the same type but
different lengths, they will either be truncated or expanded to the size of the receiving
column.

Example:

#import Personnel
#import Payroll

main() {

Personnel.adto (Payroll) ;
} // main
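
The like-column rule described for the copy and adto methods can be sketched in Python as follows. This is a minimal illustration, assuming string columns and a receiving object described by a dictionary of column widths; the names fit, add_like_columns and target_widths are hypothetical.

```python
def fit(value, width):
    """Truncate or right-pad a string value to the receiving column width."""
    return value[:width].ljust(width)


def add_like_columns(source_rows, target_rows, target_widths):
    """Append source rows to target rows, keeping only the target's columns."""
    for row in source_rows:
        new_row = {}
        for column, width in target_widths.items():
            # A column the source does not have is left blank.
            new_row[column] = fit(row.get(column, ""), width)
        target_rows.append(new_row)
    return target_rows


personnel = [{"last_name": "Washington", "ssn": "476504118", "phone": "555"}]
payroll = add_like_columns(
    personnel, [], {"last_name": 6, "ssn": 9, "salary": 5}
)
```

In this sketch last_name is truncated to the six-character receiving column, ssn is carried over unchanged, phone is dropped because the receiving object has no such column, and salary is left blank.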

ADON: method 

This method adds the data object in its arguments to the method's
object. If the column names of the objects don't match, the columns are left blank.
Only the like columns are added. If the columns are of the same type but different
lengths, they will either be truncated or expanded to the size of the receiving column.

Example:

#import Payroll
#import Personnel

main() {

Personnel.adon (Payroll) ;
} // main



REMOVE: method

This method removes the method's object from the system.

Example:

#import Personnel
main() {

Personnel.remove () ;
} // main



MODIFY: method 

This method uses an embedded script within its arguments. The
embedded script is interpreted by the method and executed for every record in the
method's object. This method updates every record in the method's object that matches
the search criteria. Any variable values changed in the embedded script are also
changed in the main script.

The example below will add 10% to the salary column of the Personnel
data object where the state column is either "CA" or "MN".

Example:

#import Personnel

main() {

Personnel.modify (

if (state == "CA" || state == "MN") {
salary += (salary * .10);

}

);

} // main
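
The per-record execution of an embedded script can be sketched in Python as follows. This is an illustration only: the embedded script is modeled as an ordinary function, records are modeled as dictionaries, and the names modify and raise_salary are hypothetical.

```python
def modify(records, embedded_script):
    """Run the embedded script once for every record; it updates matches in place."""
    for record in records:
        embedded_script(record)
    return records


def raise_salary(record):
    # Same condition and update as the script example: 10% for CA or MN.
    if record["state"] in ("CA", "MN"):
        record["salary"] += record["salary"] * 0.10


personnel = [
    {"state": "CA", "salary": 1000.0},
    {"state": "NY", "salary": 1000.0},
    {"state": "MN", "salary": 2000.0},
]
modify(personnel, raise_salary)
```

Only the records whose state matches the condition are changed; the NY record passes through the embedded script untouched.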



FIND: method 

This method uses an embedded script within its arguments. The
embedded script is interpreted by the method and executed on every record until it
finds a record in the method's object that matches the search criteria. The method then
inserts the record into a record variable for the main script to process.

The example below will find a record in the Personnel data object that
matches the search criteria and return it into the rec variable.

Example:

#import Personnel

private record rec = new Personnel ;

main() {

rec = Personnel.find (

if (ssn == "476504118")

);

} // main



FINDLOCK: method 
This method uses an embedded script within its arguments. The
embedded script is interpreted by the method and executed on every record until it
finds a record in the method's object that matches the search criteria. The method then
will lock and insert the record into a record variable for the main script to process.

The example below will find a record in the Personnel data object that
matches the search criteria and then lock it. If the lock is available, it will load the rec
variable with the record. If the lock is not available, it will throw an exception.

Example:

#import Personnel

private record rec = new Personnel ;

main() {

while(FALSE == (rec = Personnel.findlock (

if (ssn == "476504118")

))) {

sleep(1000) ;
} // while
} // main



UPDATE: method 

This method must be preceded by a findlock method in the same script 
as the method's object. The record object in the arguments is updated back into the 
same record found in the preceding findlock. 

The example below will find a record in the Personnel data object that
matches the search criteria and return it into the rec variable. The phone column is
changed and then the record is returned to the method's object.

Example:

#import Personnel

private record rec = new Personnel ;

main() {

rec = Personnel.findlock (

if (ssn == "476504118")

);

rec.phone = "425-881-5039";
Personnel.update(rec) ;
} // main
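
The findlock and update pairing can be sketched in Python as follows. This is a single-process illustration under stated assumptions: one lock per record, an exception when the lock is unavailable, and a class named LockedTable that is hypothetical and does not reflect how the apparatus stores data.

```python
import threading


class LockedTable:
    """Toy in-memory table with one lock per record (illustrative only)."""

    def __init__(self, records):
        self.records = records
        self.locks = [threading.Lock() for _ in records]

    def findlock(self, predicate):
        """Lock and return (index, copy) of the first match, or None if no match."""
        for index, record in enumerate(self.records):
            if predicate(record):
                if not self.locks[index].acquire(blocking=False):
                    raise RuntimeError("record is locked by another script")
                return index, dict(record)
        return None

    def update(self, index, record):
        """Write the record back to the found position and release its lock."""
        self.records[index] = record
        self.locks[index].release()


table = LockedTable([{"ssn": "476504118", "phone": ""}])
index, rec = table.findlock(lambda r: r["ssn"] == "476504118")
rec["phone"] = "425-881-5039"
table.update(index, rec)
```

The update call writes the changed record back to the same position located by the preceding findlock, which is why the two calls must be paired.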



COMPUTE: method 

This method uses an embedded script within its arguments. The 
embedded script is interpreted by the method and executed for every record in the 
method's object. Any variable values or objects changed in the embedded script are 
also changed in the main script. 

The example below will only process two states from the Personnel
data object and then group by state and return records containing the state and last
name and first name.

Example:

#import Client
#import Personnel
#local recset temp = { }
#local recset temp2 = { }
main() {

Personnel.compute(

if (state == "CA" || state == "MN") {
if (state == "CA") {

temp.write(format("California %s %s" ,
Personnel.last_name, Personnel.first_name) ) ;
} else {

temp2.write(format("Minnesota %s %s" ,
Personnel.last_name, Personnel.first_name) ) ;
}

}

);

temp.adon(temp2) ;
Client.write(temp) ;
} // main



JOIN: method 

This method creates a view of the data defined by the group of
columns defined in the first set of parentheses and then merges the columns from the
two records whenever the "if" part of the argument is true.

The example below will create a temp data object that contains three
columns (last_name, first_name and city). The temp data object is then returned.

Example:

#import Client

#import Personnel

#import Zipcodes

main() {

Personnel.join (

(last_name,first_name,Zipcodes.city)

if (zip == Zipcodes.zip ) ;

);

Client.write(this) ;
} // main
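
The join behavior described above can be sketched in Python as a simple nested-loop join. This is an illustration only; the function name join, the dictionary representation of records, and the sample rows are hypothetical, and no claim is made about the join strategy the apparatus actually uses.

```python
def join(left_rows, right_rows, columns, condition):
    """Emit the selected columns for every pair of records satisfying condition."""
    result = []
    for left in left_rows:
        for right in right_rows:
            if condition(left, right):
                # Merge the two records; left-hand values win on a name clash.
                merged = {**right, **left}
                result.append({name: merged[name] for name in columns})
    return result


personnel = [
    {"last_name": "Lee", "first_name": "Ann", "zip": "55101"},
    {"last_name": "Cruz", "first_name": "Raul", "zip": "00000"},
]
zipcodes = [{"zip": "55101", "city": "St. Paul"}]

temp = join(
    personnel,
    zipcodes,
    ["last_name", "first_name", "city"],
    lambda p, z: p["zip"] == z["zip"],
)
```

The column list plays the role of the first set of parentheses in the script, and the lambda plays the role of the "if" part.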



MPL: method (Multiple Points of Logic):

This command starts either a named script or a data object, which
contains a script. If the method's object is a clustered object, then the script will be
started on each server with "this" set to the data object that resides on that server. If
the method's object is not a cluster, then the script will be started on the same server
or on a server specified by a second argument with "this" set to the method's object.
When the scripts are completed the 'THIS' object from each server is returned and
merged into a new 'THIS' result.

Example:

#import Personnel
#import Client

main() {

Personnel.mpl("strip_data") ;

Client.write(this) ;
} // main
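
The start-on-each-server and merge behavior of the mpl method can be sketched in Python as follows. This is a sketch under stated assumptions: each server's portion of a clustered object is modeled as a list held in one process, threads stand in for remote servers, and the body of strip_data is invented for illustration since the specification names the script but does not show it.

```python
from concurrent.futures import ThreadPoolExecutor


def mpl(partitions, script):
    """Run the script once per server partition in parallel, then merge results."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        # pool.map preserves the order of the partitions in its results.
        per_server_results = list(pool.map(script, partitions))
    merged = []
    for result in per_server_results:
        merged.extend(result)
    return merged


def strip_data(records):
    # Hypothetical script body: keep only the last_name column.
    return [{"last_name": r["last_name"]} for r in records]


partitions = [
    [{"last_name": "Lee", "ssn": "1"}],
    [{"last_name": "Cruz", "ssn": "2"}, {"last_name": "Ito", "ssn": "3"}],
]
this = mpl(partitions, strip_data)
```

Each call to the script sees only the records that reside on its own server, and the caller receives a single merged result.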



SAVESCRIPT: method 

This method is used to add the method's object (which is the script) 
and the script name identified by the arguments to the list of federation scripts 
available to all members of the federation. After the script is added to the federation 
list of scripts, that script can be used as a script or as a method that applies to all data 
objects. 

Example: 

#local nt script=@scripts/example{ }

main() {

script.display() ;

script.savescript("scripta") ;
} // main



PRINT: method

This method is used to send the method's object to a printer. If no
arguments are supplied, the default printer identified in the config file for this
server is used. The example below searches the Personnel data object for employees with a
birthday this month. It will format the records and then print.

Example:

#import Personnel
main() {

Personnel.search (

if (birthmonth == mon$ )

);

this.format("%s %s" ,first_name, last_name) ;
this.print() ;
} // main



THIS: method

This method identifies the method's object as the current "THIS"
object.

Example:

#import Payroll

main() {

Payroll.this() ;

this.info().display() ;
} // main



READ: method

This method reads a record from the method's object and inserts it into
a record variable. A FALSE condition is returned when there are no more records.

Example:

#import Personnel

#local recset tmp = { }

private record rec = new Personnel ;

main() {

while (rec = Personnel.read() ) {

tmp.write(format("%s %s" , rec.last_name, rec.first_name) ) ;

} // while

tmp.close().display() ;
} // main



COLUMNS: command 
This command creates a temporary data object that contains a list of all
the columns of the data object.

The example will display, using notepad, a list of all columns in the data
object.

Example:

#import Personnel
main() {

Personnel.columns().display() ;
} // main



METHODS: command

This command creates a temporary data object that contains a list of all
the methods of the data object.

The example will display, using notepad, a list of all methods of the data
object.

Example: 

#import Personnel 
main() {

Personnel.methods().display() ;
} // main

The following code indicates how easily the system according to the 
preferred embodiment allows new functions to be invoked. In this case, a database, 
nt_data, is being searched for a particular State, and the results are being e-mailed (via 
Exchange) to the indicated address. 

Example:

#import nt_data

main() {

nt_data.search (if (state == "CA")).mail("Bob.Hall@unisys.com");

}



As another example, one may search a database containing the 
population of the United States for persons with particular attributes, sort the results 
of the search, and fax the results of the sort - in one easy statement. If the database
containing the population of the United States is distributed, the search and sort will
be run automatically in parallel.
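
The single-statement chaining of a search, a sort and a delivery step can be sketched in Python as follows. This is only an illustration: the class DataObject, the deliver method (a stand-in for mail or fax that appends to a list instead of sending anything), and the sample address are hypothetical.

```python
class DataObject:
    """Toy data object whose methods return new data objects so calls can chain."""

    def __init__(self, records):
        self.records = list(records)

    def search(self, predicate):
        return DataObject(r for r in self.records if predicate(r))

    def sort(self, key):
        return DataObject(sorted(self.records, key=key))

    def deliver(self, outbox, address):
        # Stand-in for mail/fax: record what would be sent, and to whom.
        outbox.append((address, self.records))
        return self


outbox = []
nt_data = DataObject([
    {"name": "B", "state": "CA"},
    {"name": "A", "state": "CA"},
    {"name": "C", "state": "MN"},
])
nt_data.search(lambda r: r["state"] == "CA").sort(lambda r: r["name"]).deliver(
    outbox, "user@example.com"
)
```

Because each method returns a data object, the result of one method becomes the object of the next, which is what allows the whole request to be written as one statement.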

In installing the apparatus according to the preferred embodiment, one 
first defines a number of environmental variables. The value of FED_federation 
number points to the "trader" for the federation. The trader contains a map network 
drive or network share. The drive contains the scripts, data object definition files,
errors and other messages for this federation. It also contains a file which identifies
the servers in the federation. The latter is called the Federation file. 

This and the following information is contained in a repository (acting 
as a "trader"). The system administrator is guided, using a graphical user interface, 
through the specification of this configuration information. A repository application 
copies the information to the indicated local files (local to each server). 

The value, FED_federation number.server number, identifies a base 
directory for the server "server number" within the federation "federation number". 
The apparatus uses the base directory for temporary files, scripts, data objects and NT 
files. The value of environmental variable, _PATH, provides the location of the 
apparatus executable. 

The following illustrates a typical federation file, listing the servers
within the federation. (This file is contained within the "trader", pointed to by
FED_federation number.):

Federation Name = 1
Server = 10 "server name 10"
Server = 1 "server name 1"
Server = 2 "cluster 2"
Server = 3 "server name 3"
Server = 4 "server name 4"
Server = 5 "server name 5"

The following illustrates a typical server configuration file, contained
within the base directory for the server, pointed to by "FED_federation.server."

Typical configuration file
Federation = 1
Listen on Port = 3200
My server = 10
Debug = 0
Sound = 1
Inp_msg_time = 500
Pre_start_processor = 1

"Pre_start_processor" indicates the initial number of agent processes
in the server. If additional processes are required, they will be generated dynamically.
The "Inp_msg_time" parameter is a time out value.

The Debug parameter specifies levels of debugging information to be
written to the trace file (the parameter has values from 1 to 10). The Sound parameter
indicates whether audio error and other messages should be provided.
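
The "key = value" layout of the federation file and the server configuration file can be read with a few lines of Python, sketched below. This is an illustration under stated assumptions: the function name parse_settings is hypothetical, repeated keys such as Server are collected into a list, and values are kept as raw strings without further interpretation.

```python
def parse_settings(text):
    """Parse 'key = value' lines; repeated keys are collected into a list."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank lines and titles such as the file caption
        key, value = (part.strip() for part in line.split("=", 1))
        settings.setdefault(key, []).append(value)
    return settings


server_config = parse_settings("""
Typical configuration file
Federation = 1
Listen on Port = 3200
My server = 10
Pre_start_processor = 1
""")

federation = parse_settings("""
Federation Name = 1
Server = 10 "server name 10"
Server = 1 "server name 1"
""")
```

Keeping repeated keys in a list preserves the order in which the servers are listed in the federation file.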

When the server is installed, the apparatus copies the federation file, all 
object definition files, and all scripts from the map network drive or network share 
(that is, from the "trader") to the local subdirectories on the server. A "start service" 
is used to start or stop the servers in a federation. 

Figures 10A-10C illustrate an alternate user interface in the form of a
Java Studio based integrated development environment in which the apparatus
(messenger, agent, assistant) is invoked as a Java Bean. A version of the apparatus
invokable as a Java Bean is provided so that, in this case, the apparatus appears to be a
Java Bean and may be invoked by, or invoke, other Java Beans.

The Java Studio™ contains a design panel (bottom right in Figure
10C) and a GUI panel (bottom left in Figure 10B). It supplies a palette of Java Beans.
When a Java Bean is dropped on the design panel, if it has a corresponding GUI
element (e.g., the Java Bean might contain a GUI text field), the latter will appear in
the GUI panel (and may be moved, resized, etc.). When a bean is selected, there is the
opportunity, via a dialogue box, to customize it. Thus, for example, a bean providing
arithmetic functions might be customized with different operators, new expressions,
etc., while GUI beans might be customized with respect to size, colors, background
colors, fonts, etc., and other attributes.

Methods in the beans are exposed as connectors, so that the beans can
be wired together visually. Components communicate by sending and receiving
messages through the connectors. Messages may be a value (normal input, output); a
trigger (e.g., a signal to start an action); ...

Having created a design, one can generate an applet, standalone
application, Java bean, or "packaged design" (a single representation of the whole
design, which may be used in future designs). (An enhancement to the integrated
development environment might show, for example, the web page into which the new
applet - if generation of an applet were selected - may be dropped.)

The GUI is active, so that, as the design is built, the resulting solution
may also be tested. Key methods disclosed above (e.g., search, sort, ...) are made
available as Java Beans or ActiveX components, and are usable independently. The
apparatus itself is available as a Java Bean or ActiveX component, driven by a simple
script, and may be used to create applications (business objects) of any degree of
complexity. The apparatus supports the construction of full mission critical
transaction or decision support solutions.

If the referenced data is distributed, the methods are invoked
transparently, as parallel services - so that one may have transparent parallel data
access and execution of methods across data distributed on the SMP nodes of a Unisys
Cellular Multiprocessor, cluster, or network. The transparently accessed data may be
heterogeneous: contained, for example, in NT files, SQL Server, Oracle, ...

In the diagram of Figures 11A-11D, a search bean is shown driven by
the indicated script. A "Personnel" database is imported, and is searched - with the
command Personnel.search - for persons with a particular birthdate. If the
"Personnel" database is distributed, the search methods will be run, automatically in 
parallel, across the distributed data. The search bean is easily customized by 
modifying the script. The example shows the supplied bean(s) being invoked by, and 
invoking, beans provided by Java Studio or by other bean "factories." 

The considerable increase in performance of a system constructed 
according to the preferred embodiment may be illustrated in connection with an actual 
example concerning mining of the population of the United States. In this example, 
five million records, spread across 15 nodes, were searched, in serial, in 3 minutes, 27 
seconds. Using a parallel search according to the preferred embodiment, the time was
17 seconds. If indices are built for the attributes of interest and searched in parallel, 
the search time is 4 seconds. If the indices are then cached, and the cached indices are 
searched in parallel, the search time is 1.2 seconds (some 172 times faster than the 
original serial search). 
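
The improvement factors implied by these timings can be checked with a short Python calculation. The figures are the ones quoted in the paragraph above; only the variable names are introduced here.

```python
def seconds(minutes, secs):
    """Convert a minutes-and-seconds timing to seconds."""
    return minutes * 60 + secs


serial = seconds(3, 27)  # the serial search: 207 seconds

# Parallel timings quoted in the text, in seconds.
timings = {
    "parallel": 17,
    "parallel indexed": 4,
    "parallel cached indices": 1.2,
}

# Improvement factor of each variant relative to the serial search.
speedups = {name: serial / t for name, t in timings.items()}
```

The cached-index case works out to 207 / 1.2, about 172.5, which agrees with the "some 172 times faster" figure in the text.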

The times will often be linear, so that ten million records on 30 nodes, 
or 2,500,000 records on 8 nodes, will be searched in the same 1.2 seconds. The 
ability to search the population of the United States in less than 16 seconds portends a 
tremendous opportunity for completely new applications and solutions.

Another example of increased performance, a search of 15 million
records across five nodes, is illustrated in the following table:

                                   Serial                     Parallel (preferred    Improvement
                                                              embodiment)            (times)

NT file system   -search           6 min., 32 sec.            1 min., 22 sec.        4.44
                 -sort             26 min., 26 sec.           6 min., 8 sec.         4.31
                 -indexed search                              3 sec.                 130

SQL Server       -search           6 min., 11 sec.            1 min., 20 sec.        4.64
                 -sort             2 hr., 38 sec.             25 min., 20 sec.       4.76

Oracle           -search           14 min., 5 sec.            3 min., 8 sec.         4.49
                 -sort             3 hrs., 17 min., 43 sec.   36 min., 18 sec.       5.45



In the above example, servers were installed on five Aquanta systems,
connected to a PC client with 10T Ethernet. The installation of the servers
transformed the environment into a cluster or multiprocessor federation. In this way, 
the servers may be used, for example, to support a virtual data warehouse (the data in 
which may reside in heterogeneous databases where a database need not scale beyond 
the largest server). 

As noted above, the preferred embodiment employs a function 
shipping model. An alternative model is a data shipping model in which, rather than 
the functions being performed as close to the data as possible, and only the results 
(e.g., the number of "hits" in a search) being returned to the requester, the data to be
searched is passed to the requester and the function is performed there. This involves
the transfer of much more data about the network, and, even with very high network
or backplane bandwidths, typically increases the latency of the operations. The 
implemented model also ensures that the latest version of the data in question never 
resides in a cache on a different node (thus eliminating cache coherency problems). 
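
The difference in network traffic between the two models can be sketched in Python as follows. This is a toy comparison under stated assumptions: nodes are modeled as in-memory lists, and the second value returned by each function simply counts how many items would cross the network. The function names are hypothetical.

```python
def data_shipping(nodes, predicate):
    """Ship every record to the requester, then count the matches there."""
    shipped = [record for node in nodes for record in node]
    hits = sum(1 for record in shipped if predicate(record))
    return hits, len(shipped)  # (result, records moved over the network)


def function_shipping(nodes, predicate):
    """Count matches at each node and ship back only the per-node counts."""
    per_node_hits = [
        sum(1 for record in node if predicate(record)) for node in nodes
    ]
    return sum(per_node_hits), len(per_node_hits)  # (result, values moved)


nodes = [
    [{"state": "CA"}, {"state": "MN"}],
    [{"state": "CA"}, {"state": "NY"}, {"state": "CA"}],
]


def is_ca(record):
    return record["state"] == "CA"
```

Both functions return the same number of hits, but function shipping moves one value per node while data shipping moves every record.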

Certain design considerations may also be noted at this point. Critical 
in terms of scaling and performance are concurrency (the degree of parallelism), 
contention (the degree of serialization) and coherency (the cost of consistency). An 
aim is to minimize contention and the cost of consistency. While one cannot totally 
eliminate contention, it is found that by performing updates as close to the data as 
possible (treating that server as largely independent of the others), and by 
randomizing the distribution of data across the servers (e.g., using hashing 
techniques), the contention is usually low (shown in almost linear scaling). 

The cost of coherency (not having a later update in someone else's 
cache) is a quadratic term, placing a maximum on the possible scaling as servers are 
added. By performing updates on the server containing the data, one ensures that 
there will never be later updates in the cache on other servers, ensuring cache 
coherency, and eliminating this cause of a maximum in the possible scaling. 
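
The effect of a quadratic coherency term can be illustrated with a simple capacity model. The sketch below uses the form known as the Universal Scalability Law, which is an assumption introduced here for illustration and is not stated in the specification; the coefficient values are arbitrary.

```python
import math


def relative_capacity(servers, contention, coherency):
    """Relative capacity of N servers with linear contention and quadratic coherency."""
    n = servers
    return n / (1 + contention * (n - 1) + coherency * n * (n - 1))


def peak_servers(contention, coherency):
    """Server count at which capacity peaks; only defined when coherency > 0."""
    return math.sqrt((1 - contention) / coherency)


# With a coherency cost, capacity rises, peaks near 31 servers, then falls.
with_coherency = [relative_capacity(n, 0.02, 0.001) for n in (8, 31, 100)]

# With the coherency cost eliminated, capacity keeps growing as servers are added.
without_coherency = [relative_capacity(n, 0.02, 0.0) for n in (8, 31, 100)]
```

In this model the quadratic term is what produces a maximum; setting it to zero, as the update-at-the-data approach aims to do, leaves only the contention term, which limits but does not reverse the scaling.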

Additionally, the preferred embodiments discussed above have 
employed the creation of data descriptor files from the metadata at run-time. In 
alternate embodiments, all files can be held in the repository, which is then accessed 
from each server at run-time or a distributed repository system may be used where 
lightweight versions of the repository reside on the local server nodes. 

The methods and apparatus of the present invention, or certain aspects 
or portions thereof, may take the form of program code (i.e., instructions) embodied in
tangible media, such as floppy diskettes, CD-ROMS, hard drives, or any other 
machine-readable storage medium, wherein, when the program code is loaded into 
and executed by a machine, such as a computer, the machine becomes an apparatus 
for practicing the invention. The methods and apparatus of the present invention may 
also be embodied in the form of program code that is transmitted over some
transmission medium, such as over electrical wiring or cabling, through fiber optics,
or via any other form of transmission, wherein, when the program code is received
and loaded into and executed by a machine, such as a computer, the machine becomes
an apparatus for practicing the invention. When implemented on a general-purpose
processor, the program code combines with the processor to provide a unique
apparatus that operates analogously to specific logic circuits.

Those skilled in the art will appreciate that various adaptations and
modifications of the just-described preferred embodiments can be configured without
departing from the scope and spirit of the invention. Therefore, it is to be understood
that within the scope of the appended claims, the invention may be practiced other
than as specifically described herein.

WHAT IS CLAIMED IS:



1. A method of accessing and operating upon heterogeneous data at a plurality of
nodes comprising the steps of:
    (1) propounding a request containing a data source object name
wherein the heterogeneous data is treated as a single data source object, said
request further containing at least one method to be performed on the data
source object and at least a second method to be performed on the results
produced by performance of the first method;
    (2) determining whether the data source object is distributed across
a plurality of nodes; and
    (3) making a determination as to whether said second method
should be performed on said results at each respective node or should be
performed at the user site after said results are transmitted from each node
back to the user site.

2. The method of Claim 1 wherein, if it is determined that the data source object
is distributed, and said second method should be performed at the respective nodes,
the request is broken into a plurality of new requests, each of said new requests
including code representing said first and second methods and having a format
appropriate to one of the respective nodes where the data source object resides.

3. The method of Claim 2 further comprising the steps of:
    transmitting said new requests to said nodes;
    executing the first method concurrently on the data source object at the
corresponding nodes;
    temporarily storing the results of execution of the first method; and
    executing the second method on said results, said step of executing
being performed at each of said nodes where the data source object resides.

4. The method of Claim 1 wherein a first agent process at the user site performs
the step of making a determination as to whether the second method should be
performed at each respective node.

5. The method of Claim 4 wherein, in performing the step of determining
whether the data source object is distributed, the first agent process consults a data
source descriptor file containing a subset of data contained in a first repository of
metadata.

6. The method of Claim 5 wherein a remote agent process automatically executes
said first method, automatically stores the results produced by executing said first
method, and automatically executes said second method on said results.

7. The method of Claim 6 wherein the results of execution of said second method
are automatically returned to the user site and automatically merged by said first
agent process, and wherein a third method is then automatically executed on the
merged results by said first agent process.

8. The method of Claim 7 wherein said first, second and third methods
respectively comprise a search of the data object, a sort of the results of the search,
and an e-mail of the merged results of the search.

9. The method of Claim 5 wherein the data source descriptor file is created from
the repository at run-time.

10. The method of Claim 3 wherein a first messenger process cooperates with said
first agent process to transmit each said new request to its respective node.

11. The method of Claim 10 wherein said request is in the form of a script and
each said new request is in the form of a script having said format.

12. The method of Claim 11 wherein said script and said new scripts are each in
the form of a Java script.

13. The method of Claim 11 wherein each of said nodes has associated therewith a
respective database, and a respective agent process, each respective agent process
comprising code selected to execute the respective new script with respect to the data
source object as it is contained in the respective database.

14. The method of Claim 13 wherein each of said databases is different from the
remaining respective databases.

15. The method of Claim 14 wherein the respective databases comprise at least
two databases, each selected from the following group: Oracle database, NT database
and SQL Server.

16. The method of Claim 13 wherein each respective agent process accesses
metadata located at the respective node in the course of executing the respective new
script at that node.

17. The method of Claim 16 wherein a data source descriptor file is created from
the metadata at each respective node for use by the respective agent process.

18. The method of Claim 16 wherein the metadata comprises a collection of data
source objects which reflect treatment of data stored in each respective database as a
single object and wherein each of said data source objects is broken down into
successive class levels.

19. The method of Claim 18 wherein said class levels include a class comprising a
System Node, System Server, Data Source Object, Field Desc and System Script.

20. The apparatus comprising:
    a plurality of databases, each located at a different node,
    data processing apparatus at each node, each data processing apparatus including an
agent code module, each agent code module being constructed to determine whether a
data source object presented in a script is distributed across a plurality of said nodes,
and if said object is distributed, to break the script into a plurality of new scripts
suitable for execution at a respective one of said nodes, and if the script contains a
plurality of successive methods, to determine whether a second of said methods
should be performed at the respective node on the results of execution of a first of said
methods at said respective node.

21. The apparatus of Claim 20 further including a messenger code module at each
node for facilitating transmission of each of said new scripts to its respective node.

22. The apparatus of Claim 20 where each said agent code module is responsive to
said script to extract the data corresponding to said data source object from said
database and to execute a method contained in said script on said data.

23. The apparatus of Claim 22 further including a repository of metadata at each
said node, said metadata describing the contents of the data source objects available at
that node.

24. The apparatus of Claim 22 further including a repository application at each
node for creating a data source descriptor file for use by the agent in extracting data
from the database at that node.

25. An article of manufacture comprising:
    a computer usable medium having computer readable program code
means embodied in said medium for accessing and executing a plurality of methods
on data at each of a plurality of nodes, said data being treated as a single data source
object, the computer readable code means comprising:
    means for receiving a request containing a data source object name
wherein heterogeneous data is treated as a single data source object, said request
further containing a plurality of methods to be performed on the data source object;
and
    means for determining whether to execute a second method contained
in one of said new requests at a first of said nodes upon the results of execution of a
first of said methods at said first node.

26. The article of Claim 25 further including means for determining whether the
data source object is distributed across a plurality of nodes; and if the data source
object is determined to be distributed, breaking the request into a plurality of new
requests, each of said new requests including code representing said second method if
said means for determining determines that said second method should be executed at
each of said plurality of nodes, each new request having a format appropriate for
execution at a respective one of said plurality of nodes.

27. The article of Claim 26 wherein said computer readable code means further
contains means for automatically executing said first method, automatically storing
the results of the execution, and then automatically executing said second method
upon the stored results and returning the results of execution of said second method to
a site which transmitted said request.

28. The article of Claim 25 wherein said computer readable code means further
includes means for automatically executing a third method included in a said request
on the returned results.

29. A data processing apparatus comprising:
    means at a user node for receiving a request containing a data source object
name wherein heterogeneous data located at a plurality of remote nodes is treated as a
single data source object, said request further containing a first method to be
performed on the data source object at each of the nodes where said data source
object resides and a second method to be performed on the results of performing the
first method on the data source object; and
    means for determining whether to execute said second method at each
of the remote nodes where said data source object resides or at said user node.

30. An article of manufacture comprising:
    a computer usable medium having computer readable program code
means embodied in said medium, the computer readable code means comprising:
    means for receiving a request containing a data source object name
wherein heterogeneous data is treated as a single data source object, said request
further containing a plurality of methods, a first of said methods to be performed on
the data source object; and
    means for determining whether the data source object is distributed
across a plurality of nodes, and if so, for determining whether to execute a second of
said methods at each of the nodes where the data source object resides upon the results
of execution of a said first of said methods.

31. The article of Claim 30 wherein said code means further comprises means for
generating a plurality of new requests suitable for execution at the nodes where the
data object resides and containing the methods to be executed there.

1 32. The article of Claim 30 wherein said means for determining further determines

2 whether the data is stored locally or is nondistributed but stored remotely and means 

3 responsive to either further determination to extract the data and execute said first

4 method upon the data. 

1 33. The article of Claim 30 wherein said computer readable code includes an agent 

2 process which performs the function of determining, and which consults a data source

3 descriptor file containing a subset of data contained in a first repository of metadata. 
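
By way of illustration only, the determination recited in Claims 30 through 33 might be sketched as a lookup in a descriptor table keyed by data source object name. The table layout, the node names, the `LOCAL_NODE` identifier and the function `classify_source` are assumptions made for this example and are not taken from the disclosure.

```python
# Hypothetical descriptor: maps a data source object name to the nodes holding it.
# In the claims this would be a subset of a larger repository of metadata.
DESCRIPTOR = {
    "Personnel": ["node_a", "node_b", "node_c"],
    "Payroll": ["node_b"],
    "Notes": ["local"],
}

LOCAL_NODE = "local"  # assumed identifier for the site running the agent


def classify_source(name, descriptor=DESCRIPTOR):
    """Return 'distributed', 'remote', or 'local' for a data source object name."""
    nodes = descriptor[name]  # raises KeyError for an unknown name
    if len(nodes) > 1:
        return "distributed"
    return "local" if nodes[0] == LOCAL_NODE else "remote"
```

In this sketch a "local" or "remote" answer would lead to extracting the data and executing the first method directly, while a "distributed" answer would lead to the further determination of where the second method should run.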

1 34. Computer executable process steps operative to control a computer and stored 

2 on a computer readable medium, comprising: 

3 a step to receive a request containing a data source object name 

4 wherein heterogeneous data stored at a plurality of nodes is treated as a single data 

5 source object, said request further containing a first method to be performed on the 

6 data source object and a second method to be performed on the results produced by 

7 performance of said first method; 

8 a step to determine whether the data source object is distributed across 

9 a plurality of remote nodes; and

10 a step wherein, if the data source object is determined to be distributed,

11 a determination is made as to whether each of said first and second methods should be 

12 performed at the plurality of remote nodes.

1 35. The process steps of Claim 34 further including: 

2 a step wherein, if it is determined that the data source object is 

3 distributed and that said first and second methods should be performed at the remote 

4 nodes, said request is broken into a plurality of new requests, each containing code 

5 representing said first and second methods. 
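
By way of illustration only, the step of breaking a request into a plurality of new requests might be sketched as follows. The `Request` type, the `NODE_FORMATS` table and the function `split_request` are hypothetical names introduced for this example; the claims require only that each new request carry the methods to be executed at its node in a format appropriate to that node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    source: str     # data source object name
    methods: tuple  # method names in execution order


# Hypothetical per-node formats, assumed for this sketch only.
NODE_FORMATS = {"node_a": "sql", "node_b": "flatfile"}


def split_request(request, nodes, remote_methods):
    """Build one new request per node carrying only the methods to run there."""
    carried = tuple(m for m in request.methods if m in remote_methods)
    return {
        node: {
            "format": NODE_FORMATS.get(node, "generic"),
            "source": request.source,
            "methods": carried,
        }
        for node in nodes
    }
```

Methods left out of `remote_methods` are not carried in the new requests, which corresponds to the case where a method is to be executed at the originating site instead.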

1 36. The process steps of Claim 34 further including a step to return results of

2 execution of said second method back to a location where said request originated. 

1 37. The process steps of Claim 34 including a step to merge results received at 

2 said location and a step to execute a third method on those results. 

1 38. The process steps of Claim 37 wherein said first, second and third methods

2 respectively are search, sort and e-mail. 
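
By way of illustration only, a search, sort and e-mail sequence with a merge of the per-node results might be sketched as follows. The functions `search` and `run_pipeline` and the `send` callback are assumptions made for this example; in particular `send` merely stands in for an e-mail step and sends nothing.

```python
import heapq


def search(records, term):
    """First method: keep records containing the term."""
    return [r for r in records if term in r]


def run_pipeline(node_data, term, send):
    """Search and sort at each node, merge at the user site, then hand off to `send`."""
    per_node = [sorted(search(records, term)) for records in node_data.values()]
    merged = list(heapq.merge(*per_node))  # merging sorted lists keeps global order
    send(merged)  # third method; `send` stands in for an e-mail step
    return merged
```

Because each node returns its results already sorted, the merge at the originating location only interleaves the lists and does not need to sort the combined data again.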

1 39. A method of accessing and operating upon data in a system wherein the data to 

2 be accessed is stored at one or more nodes, said nodes including a user site, and a 

3 plurality of remote nodes, the method comprising the steps of:

4 (1) inputting a script at the user site, the script containing a data 

5 source object name, said script further containing a plurality of methods, at

6 least one method to be performed on the data source object and one or more 

7 additional methods each to be successively performed on the results of the 

8 preceding method; and 

9 (2) executing said first method on the data source object at each 

10 node where it resides, and automatically executing each additional method on

11 the results of execution of the preceding method.

1 40. The method of Claim 39 wherein said step (2) of executing comprises: 
2 (3) interpreting said script so as to generate a plurality of new

3 scripts, each new script being appropriate for execution at a respective one of 

4 said nodes; 

5 (4) transmitting said new scripts in parallel to each of their 

6 respective nodes; and 

7 (5) automatically executing at least one of said plurality of methods 

8 in parallel at the said respective nodes on the results of executing a preceding 

9 one of said methods at said respective nodes. 
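
By way of illustration only, steps (4) and (5) might be sketched with a thread pool standing in for parallel transmission to the nodes. The functions `execute_at_node` and `dispatch` and the dictionary layout of each script are assumptions made for this example; no actual network transport is shown.

```python
from concurrent.futures import ThreadPoolExecutor


def execute_at_node(node, script):
    """Stand-in for a remote agent: run each method in the script on the prior result."""
    result = script["data"]
    for method in script["methods"]:
        result = method(result)
    return node, result


def dispatch(scripts):
    """Send every node's script concurrently and collect results by node name."""
    with ThreadPoolExecutor(max_workers=max(1, len(scripts))) as pool:
        futures = [pool.submit(execute_at_node, n, s) for n, s in scripts.items()]
        return dict(f.result() for f in futures)
```

Each worker applies the methods of its script in order, so that a later method always operates on the results of the preceding method at the same node.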

1 41. The method of Claim 40 further comprising the step of automatically returning 

2 the results of executing said at least one method to said user site. 

42. A method of accessing and operating upon data in a system wherein the data to
be accessed is stored at one or more nodes, said nodes including a user site, and a
plurality of remote nodes, the method comprising the steps of:

(1) inputting a script at the user site, the script containing a data 
source object name, said script further containing a plurality of methods, at 
least one method to be performed on the data source object and one or more 
additional methods, each additional method to be successively performed on 
the results of the preceding method; 

(2) executing said first method on the data source object at each 
node where it resides; and 

(3) pipelining the results of executing said first method to said 
second method for execution of said second method thereon. 

43. The method of Claim 42 wherein said step of pipelining comprises storing said 
results at each node where the data source object resides for execution at that 
respective node.
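
By way of illustration only, pipelining by storing results at the node where the data source object resides might be sketched as follows. The `Node` class and its attribute names are hypothetical and introduced for this example only.

```python
class Node:
    """Hypothetical node agent that keeps intermediate results locally."""

    def __init__(self, records):
        self.records = records
        self.stored = None  # results of the most recent method

    def run(self, method):
        """Apply `method` to the stored results, or to the raw records on first use."""
        source = self.records if self.stored is None else self.stored
        self.stored = method(source)
        return self.stored
```

In this sketch the results of the first method never leave the node before the second method runs; only the final stored results would be returned to the user site.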

ABSTRACT OF THE DISCLOSURE

Heterogeneous data at a plurality of remote nodes is accessed automatically in 
parallel at high speed from a user site using a simple script request containing a data 
source object name wherein the heterogeneous data is treated as a single data source 
object, the script further containing at least one method to be performed on the data
source object and at least a second method to be automatically performed on the 
results of executing the first method. A user site agent breaks the user-generated 
script into new scripts appropriate for execution at the remote nodes and determines 
whether the second method should be executed at the remote nodes or at the user site. 
A messenger process transmits the new scripts to the appropriate remote nodes where 
respective agent processes respond to automatically access the appropriate data and to 
automatically execute the specified methods. 